98 Human Secreted Proteins

Field of the Invention 

This invention relates to newly identified polynucleotides and the polypeptides 
encoded by these polynucleotides, uses of such polynucleotides and polypeptides, and 
their production.

Background of the Invention 
Unlike bacteria, which exist as a single compartment surrounded by a
membrane, human cells and other eukaryotes are subdivided by membranes into many
functionally distinct compartments. Each membrane-bounded compartment, or
organelle, contains different proteins essential for the function of the organelle. The
cell uses "sorting signals," which are amino acid motifs located within the protein, to
target proteins to particular cellular organelles.

One type of sorting signal, called a signal sequence, a signal peptide, or a 
leader sequence, directs a class of proteins to an organelle called the endoplasmic
reticulum (ER). The ER separates the membrane-bounded proteins from all other
types of proteins. Once localized to the ER, both groups of proteins can be further 
directed to another organelle called the Golgi apparatus. Here, the Golgi distributes 
the proteins to vesicles, including secretory vesicles, the cell membrane, lysosomes, 
and the other organelles. 

Proteins targeted to the ER by a signal sequence can be released into the
extracellular space as a secreted protein. For example, vesicles containing secreted
proteins can fuse with the cell membrane and release their contents into the
extracellular space - a process called exocytosis. Exocytosis can occur constitutively
or after receipt of a triggering signal. In the latter case, the proteins are stored in
secretory vesicles (or secretory granules) until exocytosis is triggered. Similarly,
proteins residing on the cell membrane can also be secreted into the extracellular 
space by proteolytic cleavage of a "linker" holding the protein to the membrane. 

Despite the great progress made in recent years, only a small number of genes 
encoding human secreted proteins have been identified. These secreted proteins
include the commercially valuable human insulin, interferon, Factor VIII, human
growth hormone, tissue plasminogen activator, and erythropoietin. Thus, in light of
the pervasive role of secreted proteins in human physiology, a need exists for
identifying and characterizing novel human secreted proteins and the genes that 
encode them. This knowledge will allow one to detect, to treat, and to prevent 
medical disorders by using secreted proteins or the genes that encode them. 

Summary of the Invention 

The present invention relates to novel polynucleotides and the encoded 
polypeptides. Moreover, the present invention relates to vectors, host cells, 
antibodies, and recombinant methods for producing the polypeptides and 
polynucleotides. Also provided are diagnostic methods for detecting disorders related 
to the polypeptides, and therapeutic methods for treating such disorders. The 
invention further relates to screening methods for identifying binding partners of the 
polypeptides. 

Detailed Description 

Definitions 

The following definitions are provided to facilitate understanding of certain 
terms used throughout this specification. 

In the present invention, "isolated" refers to material removed from its original 
environment (e.g., the natural environment if it is naturally occurring), and thus is
altered "by the hand of man" from its natural state. For example, an isolated
polynucleotide could be part of a vector or a composition of matter, or could be 
contained within a cell, and still be "isolated" because that vector, composition of 
matter, or particular cell is not the original environment of the polynucleotide. 

In the present invention, a "secreted" protein refers to those proteins capable 
of being directed to the ER, secretory vesicles, or the extracellular space as a result of
a signal sequence, as well as those proteins released into the extracellular space
without necessarily containing a signal sequence. If the secreted protein is released
into the extracellular space, the secreted protein can undergo extracellular processing
to produce a "mature" protein. Release into the extracellular space can occur by many
mechanisms, including exocytosis and proteolytic cleavage.

In specific embodiments, the polynucleotides of the invention are less than
300 kb, 200 kb, 100 kb, 50 kb, 15 kb, 10 kb, or 7.5 kb in length. In a further
embodiment, polynucleotides of the invention comprise at least 15 contiguous 
nucleotides of the coding sequence, but do not comprise all or a portion of any intron. 
In another embodiment, the nucleic acid comprising the coding sequence does not 
contain coding sequences of a genomic flanking gene (i.e., 5' or 3' to the gene in the
genome). 

As used herein, a "polynucleotide" refers to a molecule having a nucleic acid
sequence contained in SEQ ID NO:X or the cDNA contained within the clone
deposited with the ATCC. For example, the polynucleotide can contain the
nucleotide sequence of the full length cDNA sequence, including the 5' and 3'
untranslated sequences, the coding region, with or without the signal sequence, the 
secreted protein coding region, as well as fragments, epitopes, domains, and variants 
of the nucleic acid sequence. Moreover, as used herein, a "polypeptide" refers to a 
molecule having the translated amino acid sequence generated from the 
polynucleotide as broadly defined. 

In the present invention, the full length sequence identified as SEQ ID NO:X 
was often generated by overlapping sequences contained in multiple clones (contig 
analysis). A representative clone containing all or most of the sequence for SEQ ID 
NO:X was deposited with the American Type Culture Collection ("ATCC"). As 
shown in Table 1, each clone is identified by a cDNA Clone ID (Identifier) and the 
ATCC Deposit Number. The ATCC is located at 10801 University Boulevard,
Manassas, Virginia 20110-2209, USA. The ATCC deposit was made pursuant to the
terms of the Budapest Treaty on the international recognition of the deposit of 
microorganisms for purposes of patent procedure. 

A "polynucleotide" of the present invention also includes those 
polynucleotides capable of hybridizing, under stringent hybridization conditions, to 
sequences contained in SEQ ID NO:X, the complement thereof, or the cDNA within 
the clone deposited with the ATCC. "Stringent hybridization conditions" refers to an 
overnight incubation at 42°C in a solution comprising 50% formamide, 5x SSC (750
mM NaCl, 75 mM sodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's



WO 00/06698 PCT/US99/17130



solution, 10% dextran sulfate, and 20 µg/ml denatured, sheared salmon sperm DNA,
followed by washing the filters in 0.1x SSC at about 65°C.

Also contemplated are nucleic acid molecules that hybridize to the 
polynucleotides of the present invention at lower stringency hybridization conditions. 
Changes in the stringency of hybridization and signal detection are primarily
accomplished through the manipulation of formamide concentration (lower
percentages of formamide result in lowered stringency), salt conditions, or
temperature. For example, lower stringency conditions include an overnight
incubation at 37°C in a solution comprising 6X SSPE (20X SSPE = 3M NaCl; 0.2M
NaH2PO4; 0.02M EDTA, pH 7.4), 0.5% SDS, 30% formamide, 100 µg/ml salmon
sperm blocking DNA; followed by washes at 50°C with 1X SSPE, 0.1% SDS. In
addition, to achieve even lower stringency, washes performed following stringent
hybridization can be done at higher salt concentrations (e.g., 5X SSC).
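
As an illustrative aid only, the salt concentrations implied by the buffer strengths named above can be worked out by simple scaling. The sketch below assumes linear dilution from the stock definitions given in the text (5x SSC = 750 mM NaCl and 75 mM sodium citrate; 20X SSPE = 3 M NaCl, 0.2 M NaH2PO4, 0.02 M EDTA). The function and variable names are hypothetical and are not part of the specification.

```python
# Illustrative sketch: component concentrations (in mM) of SSC and SSPE
# at a given strength, assuming simple linear scaling of the stock recipes
# quoted in the text. All names here are hypothetical.

# 1x concentrations derived from the stated stocks.
SSC_1X_MM = {"NaCl": 750 / 5, "sodium citrate": 75 / 5}
SSPE_1X_MM = {"NaCl": 3000 / 20, "NaH2PO4": 200 / 20, "EDTA": 20 / 20}


def at_strength(one_x_mm, strength):
    """Return the concentration of each component (mM) at `strength` x."""
    return {name: round(conc * strength, 6) for name, conc in one_x_mm.items()}


if __name__ == "__main__":
    # Stringent wash buffer named in the text: 0.1x SSC.
    print("0.1x SSC:", at_strength(SSC_1X_MM, 0.1))
    # Lower-stringency hybridization buffer named in the text: 6X SSPE.
    print("6X SSPE:", at_strength(SSPE_1X_MM, 6))
    # Lower-stringency wash buffer named in the text: 1X SSPE.
    print("1X SSPE:", at_strength(SSPE_1X_MM, 1))
```

Under these assumptions the stringent wash (0.1x SSC) contains roughly one fiftieth of the NaCl present in the 5x SSC hybridization solution, which is consistent with the statement that higher salt in the wash lowers stringency.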

Note that variations in the above conditions may be accomplished through the
inclusion and/or substitution of alternate blocking reagents used to suppress
background in hybridization experiments. Typical blocking reagents include
Denhardt's reagent, BLOTTO, heparin, denatured salmon sperm DNA, and
commercially available proprietary formulations. The inclusion of specific blocking
reagents may require modification of the hybridization conditions described above,
due to problems with compatibility.

Of course, a polynucleotide which hybridizes only to polyA+ sequences (such
as any 3' terminal polyA+ tract of a cDNA shown in the sequence listing), or to a
complementary stretch of T (or U) residues, would not be included in the definition of
"polynucleotide," since such a polynucleotide would hybridize to any nucleic acid
molecule containing a poly (A) stretch or the complement thereof (e.g., practically
any double-stranded cDNA clone).
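
The exclusion above is defined in terms of hybridization behavior, but a rough sequence-level proxy can be sketched. The example below is an assumption-laden simplification: it flags a candidate probe only when it consists entirely of A residues, or entirely of T (or U) residues. The function name is hypothetical.

```python
# Illustrative sketch: flag probes that consist solely of a poly(A) tract
# or of its complement (a run of T or U). This is a sequence-level proxy
# for the hybridization-based exclusion in the text, not a definition.

def is_homopolymer_tract(seq):
    """Return True if `seq` is non-empty and made only of A, only of T, or only of U."""
    residues = set(seq.upper())
    return residues in ({"A"}, {"T"}, {"U"})


if __name__ == "__main__":
    for probe in ("AAAAAAAAAAAAAAA", "tttttttttt", "AAAACGTAAAA"):
        print(probe, is_homopolymer_tract(probe))
```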

The polynucleotide of the present invention can be composed of any 
polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or
DNA or modified RNA or DNA. For example, polynucleotides can be composed of
single- and double-stranded DNA, DNA that is a mixture of single- and double-
stranded regions, single- and double-stranded RNA, and RNA that is a mixture of
single- and double-stranded regions, hybrid molecules comprising DNA and RNA
that may be single-stranded or, more typically, double-stranded or a mixture of single-
and double-stranded regions. In addition, the polynucleotide can be composed of 
triple-stranded regions comprising RNA or DNA or both RNA and DNA. A 
polynucleotide may also contain one or more modified bases or DNA or RNA 
backbones modified for stability or for other reasons. "Modified" bases include, for 
example, tritylated bases and unusual bases such as inosine. A variety of 
modifications can be made to DNA and RNA; thus, "polynucleotide" embraces 
chemically, enzymatically, or metabolically modified forms. 

The polypeptide of the present invention can be composed of amino acids 
joined to each other by peptide bonds or modified peptide bonds, i.e., peptide 
isosteres, and may contain amino acids other than the 20 gene-encoded amino acids. 
The polypeptides may be modified by either natural processes, such as 
posttranslational processing, or by chemical modification techniques which are well 
known in the art. Such modifications are well described in basic texts and in more 
detailed monographs, as well as in a voluminous research literature. Modifications 
can occur anywhere in a polypeptide, including the peptide backbone, the amino acid 
side-chains and the amino or carboxyl termini. It will be appreciated that the same 
type of modification may be present in the same or varying degrees at several sites in 
a given polypeptide. Also, a given polypeptide may contain many types of 
modifications. Polypeptides may be branched, for example, as a result of
ubiquitination, and they may be cyclic, with or without branching. Cyclic, branched,
and branched cyclic polypeptides may result from posttranslational natural processes or
may be made by synthetic methods. Modifications include acetylation, acylation,
ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a
heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent
attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol,
cross-linking, cyclization, disulfide bond formation, demethylation, formation of
covalent cross-links, formation of cysteine, formation of pyroglutamate, formylation,
gamma-carboxylation, glycosylation, GPI anchor formation, hydroxylation,
iodination, methylation, myristoylation, oxidation, pegylation, proteolytic processing,
phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA
mediated addition of amino acids to proteins such as arginylation, and ubiquitination.
(See, for instance, PROTEINS - STRUCTURE AND MOLECULAR PROPERTIES,
2nd Ed., T. E. Creighton, W. H. Freeman and Company, New York (1993);
POSTTRANSLATIONAL COVALENT MODIFICATION OF PROTEINS, B. C.
Johnson, Ed., Academic Press, New York, pgs. 1-12 (1983); Seifter et al., Meth
Enzymol 182:626-646 (1990); Rattan et al., Ann NY Acad Sci 663:48-62 (1992).)

"SEQ ID NO:X" refers to a polynucleotide sequence while "SEQ ID NO: Y" 
refers to a polypeptide sequence, both sequences identified by an integer specified in 
Table 1. 

"A polypeptide having biological activity" refers to polypeptides exhibiting 
activity similar, but not necessarily identical to, an activity of a polypeptide of the
present invention, including mature forms, as measured in a particular biological
assay, with or without dose dependency. In the case where dose dependency does
exist, it need not be identical to that of the polypeptide, but rather substantially similar
to the dose-dependence in a given activity as compared to the polypeptide of the
present invention (i.e., the candidate polypeptide will exhibit greater activity or not
more than about 25-fold less and, preferably, not more than about tenfold less
activity, and most preferably, not more than about three-fold less activity relative to
the polypeptide of the present invention).
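
The fold-difference thresholds in this definition can be illustrated with a short sketch. It assumes that activity is reported as a single positive number from the same assay for both the candidate and the reference polypeptide, and it treats the "about" limits (25-fold, tenfold, three-fold) as exact cut-offs purely for illustration. The function name and tier labels are hypothetical.

```python
# Illustrative sketch: classify a candidate polypeptide's activity relative
# to a reference polypeptide using the 25-, 10- and 3-fold limits from the
# text. The text says "about", so these hard cut-offs are an assumption.

def activity_tier(candidate_activity, reference_activity):
    """Return a label describing how the candidate compares with the reference."""
    if candidate_activity <= 0 or reference_activity <= 0:
        raise ValueError("activities must be positive")
    fold_less = reference_activity / candidate_activity  # <1 means greater activity
    if fold_less <= 3:
        return "most preferred"
    if fold_less <= 10:
        return "preferred"
    if fold_less <= 25:
        return "within definition"
    return "outside definition"


if __name__ == "__main__":
    reference = 100.0
    for candidate in (200.0, 20.0, 5.0, 1.0):
        print(candidate, activity_tier(candidate, reference))
```

A candidate with greater activity than the reference falls in the first tier, since the definition places no upper bound on activity.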

Polynucleotides and Polypeptides of the Invention

FEATURES OF PROTEIN ENCODED BY GENE NO: 1 

The translation product of this gene is a human glycoprotein-associated amino 
acid transporter (See, e.g., Genbank Accession No. emb|CAA10198.1| (AJ130718);
all references available through this accession are hereby incorporated by reference
herein). Amino acid transport across cellular membranes is mediated by multiple
transporters with overlapping specificities. The transport system L, which mediates
Na+-independent exchange of large neutral amino acids, consists of a novel amino
acid permease-related protein (LAT1 or AmAT-L-lc) which for surface expression
and function requires formation of disulfide-linked heterodimers with the
glycosylated heavy chain of the h4F2/CD98 surface antigen. h4F2hc also associates
with other mammalian light chains, e.g. y+LAT1 from mouse and human which are
approximately 48% identical with LAT1 and thus belong to the same family of
glycoprotein-associated amino acid transporters.

The novel heterodimers form exchangers which mediate the cellular efflux of 
cationic amino acids and the Na+-dependent uptake of large neutral amino acids. 
These transport characteristics and kinetic and pharmacological fingerprints identify 
them as y+L-type transport systems. mRNA encoding my+LAT1 is detectable in most
adult tissues and expressed at high levels in kidney cortex and intestine. This indicates
that the y+LAT1-4F2hc heterodimer, besides participating in amino acid
uptake/secretion in many cell types, is the basolateral amino acid exchanger involved
in transepithelial reabsorption of cationic amino acids; hence, its defect might be the
cause of the human genetic disease lysinuric protein intolerance. 

The gene encoding the disclosed cDNA is believed to reside on chromosome 
14. Accordingly, polynucleotides related to this invention are useful as a marker in 
linkage analysis for chromosome 14. 

Preferred polypeptides comprise the following amino acid sequence: 
LALYSALFSYSGWDTLN (SEQ ID NO: 237), VTEEIKNPERNLPL (SEQ ID NO:
238), IGISMPIVT (SEQ ID NO: 239), IYILTNVAYYTVL (SEQ ID NO: 240),
SDAVAVTFADQ (SEQ ID NO: 241), VALSCFGGLNASI (SEQ ID NO: 242),
SRLFFVGSREGHLPD (SEQ ID NO: 243), SFSYWFFVGLS (SEQ ID NO: 244),
VGQLYLRWKEP (SEQ ID NO: 245), RPRPLKLSVFFPIVFC (SEQ ID NO: 246), DTINSLIGI
(SEQ ID NO: 247), LLAAACICLLTFINCAYVKWGTLVQDIFTYAKVLALIAVI 
VAGIVRLGQGASTHFENSFEGSSFAVGDIALALYSALFSYSGWDTLNYVTEEI 
KNPERNLPLSIGISMPIVTIIYILTNVAYYTVLDMRDILASDAVAVTFADQIFGIF 
NWIIPLSVALSCFGGLNASIVAASRLFFVGSREGHLPDAICMIHVERFTPVPSLL 
FNGIMALIYLCVEDIFQLINYYSFSYWFFVGLSIVGQLYLRWKEPDRPRPLKLS 
VFFPIVFCLCTIFLVAVPLYSDTINSLIGIAIALSGLPFYFLIIRVPEHKRPLYLRRI 
VGSATRYLQVLCMSVAAEMDLEDGGEMPKQRDPKSN (SEQ ID NO: 249) 
and/or ATALPPKJVGSATRYLQVLCMSVAAEMDLEDGGEMPKQRDPKSN 
(SEQ ID NO: 248). Polynucleotides encoding these polypeptides are also provided. 
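
Because the shorter preferred sequences are fragments of the longer one, their positions can be checked by simple substring search. The sketch below is illustrative only: it uses a short excerpt of the SEQ ID NO: 249 sequence as printed above (not the full sequence), positions are 1-based within that excerpt rather than within the full protein, and the helper name is hypothetical. A string that does not occur in the excerpt returns `None`, which is one way extraction errors in a fragment can be spotted.

```python
# Illustrative sketch: locate short preferred fragments inside a longer
# polypeptide sequence. EXCERPT is a contiguous portion of the sequence
# printed as SEQ ID NO: 249 in the text, not the complete sequence.

EXCERPT = "GDIALALYSALFSYSGWDTLNYVTEEIKNPERNLPLSIGISMPIVTIIYILTNVAYYTVL"


def locate(fragment, full):
    """Return the 1-based (start, end) of `fragment` in `full`, or None if absent."""
    idx = full.find(fragment)
    if idx < 0:
        return None
    return (idx + 1, idx + len(fragment))


if __name__ == "__main__":
    fragments = [
        "LALYSALFSYSGWDTLN",
        "VTEEIKNPERNLPL",
        "IGISMPIVT",
        "IYILTNVAYYTVL",
        "VTEEIKJsTERNLPL",  # garbled string, not expected to be found
    ]
    for frag in fragments:
        print(frag, locate(frag, EXCERPT))
```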

Contact of cells with supernatant expressing the product of this gene has been 
shown to increase the permeability of the plasma membrane of THP-1 monocyte cells 
to calcium. Thus, it is likely that the product of this gene is involved in a signal
transduction pathway that is initiated when the product binds a receptor on the surface
of the plasma membrane of THP-1 monocytes, as well as other cell lines or
tissue cell types. Thus, polynucleotides and polypeptides have uses which include, but
are not limited to, activating monocytes.

This gene is expressed primarily in endothelial cells and brain, and, to a lesser 
extent, in a wide variety of human tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, disorders of the neural or gastrointestinal systems. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the circulation 
system or central nervous system, expression of this gene at significantly higher or 
lower levels is routinely detected in certain tissues or cell types (e.g., neural, 
gastrointestinal, cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum, 
plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken 
from an individual having such a disorder, relative to the standard gene expression 
level, i.e., the expression level in healthy tissue or bodily fluid from an individual not 
having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 124 as residues: Glu-102 to Asn-110, Arg-256 to
Leu-266, Pro-316 to Trp-328, Pro-331 to Arg-336, Met-350 to Gly-358. 
Polynucleotides encoding said polypeptides are also provided. 

The tissue distribution in brain combined with its homology to an amino acid
transporter and biological activity of increasing ion flux in monocytes indicates
polynucleotides and polypeptides corresponding to this gene are useful for the
detection, treatment, and/or prevention of neurodegenerative disease states, behavioral
disorders, or inflammatory conditions. Representative uses are described in the
"Regeneration" and "Hyperproliferative Disorders" sections below, in Examples 11,
15, and 18, and elsewhere herein. Briefly, the uses include, but are not limited to, the
detection, treatment, and/or prevention of Alzheimer's Disease, Parkinson's Disease,
Huntington's Disease, Tourette Syndrome, meningitis, encephalitis, demyelinating
diseases, peripheral neuropathies, neoplasia, trauma, congenital malformations, spinal
cord injuries, ischemia and infarction, aneurysms, hemorrhages, schizophrenia,
mania, dementia, paranoia, obsessive compulsive disorder, depression, panic disorder,
learning disabilities, ALS, psychoses, autism, and altered behaviors, including
disorders in feeding, sleep patterns, balance, and perception. In addition, elevated 
expression of this gene product in regions of the brain indicates it plays a role in 
normal neural function. 

Potentially, this gene product is involved in synapse formation,
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or
survival. Furthermore, the protein may also be used to determine biological activity,
to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to
identify agents that modulate their interactions, in addition to its use as a nutritional
supplement. Protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO: 11 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 and 1550 of SEQ ID NO: 11, b is an
integer of 15 to 1564, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO: 11, and where b is greater than or equal to a + 14.
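
The a-b formula can be read as describing every subsequence of at least 15 nucleotides within the stated bounds. The sketch below is one interpretation, assuming that the stated ranges are inclusive and that positions are 1-based; the default bounds are the ones given for SEQ ID NO: 11, and the function names are hypothetical.

```python
# Illustrative sketch of the a-b formula: a is a start position, b is an end
# position, and b >= a + 14 means the span covers at least 15 residues.
# Inclusive, 1-based bounds are an assumption about how the text is read.

def is_described_by_formula(a, b, a_max=1550, b_max=1564):
    """Return True if positions (a, b) satisfy the a-b formula."""
    return 1 <= a <= a_max and 15 <= b <= b_max and b >= a + 14


def count_described_spans(a_max=1550, b_max=1564):
    """Count all (a, b) pairs that satisfy the formula."""
    total = 0
    for a in range(1, a_max + 1):
        lowest_b = max(15, a + 14)
        if lowest_b <= b_max:
            total += b_max - lowest_b + 1
    return total


if __name__ == "__main__":
    print(is_described_by_formula(1, 15))       # shortest span at the 5' end
    print(is_described_by_formula(1550, 1564))  # shortest span at the 3' end
    print(is_described_by_formula(10, 20))      # only 11 residues long
    print(count_described_spans())
```

The same check applies to the corresponding passages for the other genes by substituting the bounds stated there.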

FEATURES OF PROTEIN ENCODED BY GENE NO: 2 

The gene encoding the disclosed cDNA is believed to reside on the X 
chromosome. Accordingly, polynucleotides related to this invention are useful as a 
marker in linkage analysis for the X chromosome.

Preferred polypeptides of the invention comprise the following amino acid
sequence:
AARGSGVRDPLEEAVCPFSDLQLHAGRTTALFKAVRQGHLSLQRLLLSFVCL
CPAPRGGAYRGRQASLSCGGLHPVRASRLLCLPKQAWAMAGAPPPVSLPPCS
LISDCCASNQRDSVG (SEQ ID NO: 250). Polynucleotides encoding these
polypeptides are also provided.

This gene is expressed primarily in cord blood cells, and, to a lesser extent, in
frontal lobe of the brain.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, developmental, reproductive, hematopoietic or neural disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune and central nervous systems, expression of this gene at significantly higher or
lower levels is routinely detected in certain tissues or cell types (e.g., immune,
developmental, or cancerous and wounded tissues) or bodily fluids (e.g., lymph,
amniotic fluid, serum, plasma, urine, synovial fluid and spinal fluid) or another tissue
or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 125 as residues: His-56 to Gln-65, Leu-80 to Ile-85. 
Polynucleotides encoding said polypeptides are also provided. 

The tissue distribution in cord blood cells indicates polynucleotides and
polypeptides corresponding to this gene are useful for the treatment and diagnosis of
hematopoietic related disorders such as anemia, pancytopenia, leukopenia,
thrombocytopenia or leukemia since stromal cells are important in the production of
cells of hematopoietic lineages. Representative uses are described in the "Immune
Activity" and "infectious disease" sections below, in Examples 11, 13, 14, 16, 18, 19,
20, and 27, and elsewhere herein. Briefly, the uses include bone marrow cell ex-vivo
culture, bone marrow transplantation, bone marrow reconstitution, radiotherapy or
chemotherapy of neoplasia.

The gene product may also be involved in lymphopoiesis; therefore, it can be
used in immune disorders such as infection, inflammation, allergy, immunodeficiency,
etc. In addition, this gene product may have commercial utility in the expansion of
stem cells and committed progenitors of various blood lineages, and in the
differentiation and/or proliferation of various cell types. Alternatively,
polynucleotides and polypeptides corresponding to this gene are useful for the 
diagnosis and treatment of cancer and other proliferative disorders. Expression within 
embryonic tissue and other cellular sources marked by proliferating cells indicates 
that this protein may play a role in the regulation of cellular division. Similarly,
embryonic development also involves decisions involving cell differentiation and/or
apoptosis in pattern formation. Thus, this protein may also be involved in apoptosis or
tissue differentiation and could again be useful in cancer therapy. Furthermore, the
protein may also be used to determine biological activity, to raise antibodies, as tissue
markers, to isolate cognate ligands or receptors, to identify agents that modulate their
interactions, in addition to its use as a nutritional supplement. Protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO: 12 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 and 1743 of SEQ ID NO: 12, b is an
integer of 15 to 1757, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO: 12, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 3 
This gene is expressed primarily in human T cell lymphomas.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, T cell lymphoma, immunodeficiencies, in addition to other immune
system disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels is routinely detected in certain tissues or cell types
(e.g., immune, or cancerous and wounded tissues) or bodily fluids (e.g., lymph,
serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 126 as residues: Met-1 to Phe-10. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in human T cell lymphomas indicates polynucleotides 
and polypeptides corresponding to this gene are useful for the diagnosis and treatment 
of a variety of immune system disorders. Representative uses are described in the
"Immune Activity" and "infectious disease" sections below, in Examples 11, 13, 14,
16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the expression of this gene
product indicates a role in regulating the proliferation, survival, differentiation, and/or
activation of hematopoietic cell lineages, including blood stem cells. This gene 
product is involved in the regulation of cytokine production, antigen presentation, or 
other processes suggesting a usefulness in the treatment of cancer (e.g., by boosting 
immune responses). 

Since the gene is expressed in cells of lymphoid origin, the natural gene 
product is involved in immune functions. Therefore it is also used as an agent for 
immunological disorders including arthritis, asthma, immunodeficiency diseases such 
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
Disease, scleroderma and tissues. Moreover, the protein may represent a secreted
factor that influences the differentiation or behavior of other blood cells, or that
recruits hematopoietic cells to sites of injury. In addition, this gene product may have
commercial utility in the expansion of stem cells and committed progenitors of
various blood lineages, and in the differentiation and/or proliferation of various cell
types. Furthermore, the protein may also be used to determine biological activity,
to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify
agents that modulate their interactions, in addition to its use as a nutritional
supplement. Protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO: 13 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 and 1359 of SEQ ID NO: 13, b is an
integer of 15 to 1373, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO: 13, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 4 
The protein product of this clone shares sequence homology with the
C-terminus of a human N-acetylglucosamine-phosphate mutase (See, e.g., Genbank
Accession No. gb|AAC72409.1| (AF102265); all references available through this
accession are hereby incorporated by reference herein.) Hofmann, et al. (Eur. J.
Biochem. 221:741-747 (1994)) studied the N-acetylglucosamine-phosphate mutase of
Saccharomyces cerevisiae and showed it to be essential for viability. A S. cerevisiae
agm1 deletion mutant progressed through only approximately five cell cycles to form
a string of undivided cells with an abnormal cell morphology resembling
glucosamine auxotrophic mutants. Expression of the AGM1 gene on a multi-copy
plasmid led to a significantly increased N-acetylglucosamine-phosphate mutase
activity. Unlike over-expression of the S. cerevisiae AGM1 gene in a
phosphoglucomutase (pgm1 delta/pgm2 delta) double deletion mutant which could
restore phosphoglucomutase activity, over-expression of the PGM2 gene encoding the
major isoenzyme of phosphoglucomutase did not increase N-acetylglucosamine-
phosphate-mutase activity and did not restore growth of agm1 deletion mutant cells.
These observations indicate that the different hexosephosphate mutases of S.
cerevisiae have partially overlapping substrate specificities but, nevertheless, distinct
physiological functions. The human N-acetylglucosamine-phosphate-mutase is
expected to share at least some biological activities with the Agm1 protein.

Preferred polypeptide fragments of the invention comprise the following
amino acid sequences: LSKAFLDSPNRLLAVEMNTDHLRLTVPNGIGALKLRXM
EHYFSQGLSVQLFNDGSKGKLNHLCGADFVKSHQKPPQGMEIKSNERCCSFD
GDADRIVYYYHDADGHFHLIDGDKIATLISSFLKELLVEIGESLNIGVVQTAYA
NGSSTRYLEEVMKVPVYCTKTGVKHLHHKAQEFDIGVYFEANGHGTALFST
AVEMKIKQSAEQLEDKKRKAAKMLENIIDLFNQAAGDAISDMLVIEAILALK
GLTVQQWDALYTDLPNRQLKVQVADRRVISTTXAERQAVTPPGLQEAINDL
VKKYKLSRAFVRPSGTEDVVRVYAEADSQESADHLAHEVSLAVFQLAGGIGE
RPQPGF (SEQ ID NO: 251), LSKAFLDSPNRLLAVEMNTDHLRLTV (SEQ ID
NO: 252), PNGIGALKLRXMEHYFSQGLSVQLFNDG (SEQ ID NO: 253), SKGKL
NHLCGADFVKSHQKPPQGMEIKS (SEQ ID NO: 254), NERCCSFDGDADRIV
YYYHDADGHFHLI (SEQ ID NO: 255), DGDKIATLISSFLKELLVEIGESLNIGV
(SEQ ID NO: 256), VQTAYANGSSTRYLEEVMKVPVYCTKTG (SEQ ID NO:
257), VKHLHHKAQEFDIGVYFEANGHGTALFS (SEQ ID NO: 258), TAVEMK
IKQSAEQLEDKKRKAAKMLENI (SEQ ID NO: 259), IDLFNQAAGDAISDM
LVIEAILALKGLT (SEQ ID NO: 260), VQQWDALYTDLPNRQLKVQVADRR
VIST (SEQ ID NO: 261), TXAERQAVTPPGLQEAINDLVKKYKLSR (SEQ ID
NO: 262), AFVRPSGTEDVVRVYAEADSQESA (SEQ ID NO: 263), and/or DH
LAHEVSLAVFQLAGGIGERPQPGF (SEQ ID NO: 264). Polynucleotides encoding
these polypeptides are also provided.

The gene encoding the disclosed cDNA is believed to reside on chromosome
6. Accordingly, polynucleotides related to this invention are useful as a marker in
linkage analysis for chromosome 6.

This gene is expressed primarily in fetal brain, and, to a lesser extent, in a
wide variety of human tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, developmental disorders, particularly of the central nervous system.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
central and peripheral nervous system, expression of this gene at significantly higher
or lower levels is routinely detected in certain tissues or cell types (e.g., neural, or
cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum, plasma, urine,
synovial fluid and spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 127 as residues: Asn-36 to Lys-42, Lys-53 to Gln-60,
Ile-64 to Ala-77, Ala-128 to Tyr-135, Lys-184 to Ala-199, Leu-245 to Leu-250.
Polynucleotides encoding said polypeptides are also provided.

The tissue distribution of N-acetylglucosamine-phosphate mutase in fetal
brain indicates polynucleotides and polypeptides corresponding to this gene are useful
for the detection, treatment, and/or prevention of neurodegenerative disease states,
behavioral disorders, or inflammatory conditions. Representative uses are described in
the "Regeneration" and "Hyperproliferative Disorders" sections below, in Example
11, 15, and 18, and elsewhere herein. Briefly, the uses include, but are not limited to
the detection, treatment, and/or prevention of Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, Tourette Syndrome, meningitis, encephalitis,
demyelinating diseases, peripheral neuropathies, neoplasia, trauma, congenital
malformations, spinal cord injuries, ischemia and infarction, aneurysms, hemorrhages,
schizophrenia, mania, dementia, paranoia, obsessive compulsive disorder, depression,
panic disorder, learning disabilities, ALS, psychoses, autism, and altered behaviors,
including disorders in feeding, sleep patterns, balance, and perception. In addition,
elevated expression of this gene product in regions of the brain indicates it plays a
role in normal neural function.

Potentially, this gene product is involved in synapse formation,
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or
survival. Furthermore, the protein may also be used to determine biological activity,
to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to
identify agents that modulate their interactions, in addition to its use as a nutritional
supplement. Protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO: 14 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 3726 of SEQ ID NO: 14, b is an
integer of 15 to 3740, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO: 14, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 5

This gene is expressed primarily in human stomach and stomach tumor cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, disorders of the gastrointestinal system, particularly cancer or ulcers of
stomach tissue. Similarly, polypeptides and antibodies directed to these polypeptides
are useful in providing immunological probes for differential identification of the
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the digestive system, expression of this gene at significantly higher or
lower levels is routinely detected in certain tissues or cell types (e.g., gastrointestinal,
or cancerous and wounded tissues) or bodily fluids (e.g., bile, lymph, serum, plasma,
urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution in tumors of the stomach indicates that polynucleotides
and polypeptides corresponding to this gene are useful for diagnosis, treatment and
intervention of these tumors, in addition to other tumors where expression has been
indicated. Additionally, the protein product of this gene may play a role in the normal
function of the stomach and/or digestive system. Furthermore, the protein may also be
used to determine biological activity, to raise antibodies, as tissue markers, to isolate
cognate ligands or receptors, to identify agents that modulate their interactions, in
addition to its use as a nutritional supplement. Protein, as well as antibodies directed
against the protein, may show utility as a tumor marker and/or immunotherapy targets
for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO: 15 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1182 of SEQ ID NO: 15, b is an
integer of 15 to 1196, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO: 15, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 6

Preferred polypeptides of the invention comprise the following amino acid
sequences:
FEIALPRESNITVLIKLGTPTLLAKPCYIVISKRHITMLSIKSGERIVFTFSCQSPE
NHFVIEIQKNIDCMSGPCPFGEVQLQPSTSLLPTLNRTFIWDVKAHKSIGLELQ
FSIPRLRQIGPGESCPDGVTHSISGRIDATVVRIGTFCSNGTVSRIKM (SEQ ID
NO: 266), and/or GTRAAPGLGAWGRRSPPSFSPPRPRRPG VMAGLIS'CGVSIAL
LGVLLLGAARLPRGAEAFEIALPRESNITVLIKLGTPTLLAKPCYIVISKRHITM
LSIKSGERIVFTFSCQSPENHFVIEIQKNIDCMSGPCPFGEVQLQPSTSLLPTLNR
TFIWDVKAHKSIGLELQFSIPRLRQIGPGESCPDGVTHSISGRIDATVVRIGTFC
SNGTVSRIKMQEGVKMALHLPWFHPRNVSGFSIANRSSIKRLCIIESVFEGEGS
::ATLMSANYPEGFPEDELMTWQFVVPAHLRASVSFLNFNLSxNCERKEERVEYY
IPGSTTNPEVFKLEDKQPGNMAGNFNLSLQGCDQDAQSPGILRLQFQVLVQH
PQNESNKIYVVDLSNERAMSLTIEPRPVKQSRKFVPGCFVCLESRTCSSNLTLT
SGSKHKISFLCDDLTRLWMNVEKP (SEQ ID NO: 265). Polynucleotides encoding
these polypeptides are also provided.

This gene is expressed primarily in placenta, and to a lesser extent, in prostate
and ovary.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, male and female infertility, and associated disorders of the
reproductive system. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the reproductive system, expression of this gene at
significantly higher or lower levels is routinely detected in certain tissues or cell types
(e.g., reproductive, or cancerous and wounded tissues) or bodily fluids (e.g., lymph,
amniotic fluid, seminal fluid, serum, plasma, urine, synovial fluid and spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative
to the standard gene expression level, i.e., the expression level in healthy tissue or
bodily fluid from an individual not having the disorder.

The tissue distribution of this gene in the prostate, placenta and ovary
indicates that polynucleotides and polypeptides corresponding to this gene are useful
for treatment, prevention, and/or diagnosis of male or female infertility, endocrine
disorders, fetal deficiencies, ovarian failure, amenorrhea, ovarian cancer, benign
prostate hyperplasia and prostate cancer. Similarly, the tissue distribution indicates
that polynucleotides and polypeptides corresponding to this gene are useful for the
diagnosis and treatment of cancer and other proliferative disorders. Expression within
placental tissue and other cellular sources marked by proliferating cells indicates that
this protein may play a role in the regulation of cellular division. Similarly, embryonic
development also involves decisions involving cell differentiation and/or apoptosis in
pattern formation. Thus, this protein may also be involved in apoptosis or tissue
differentiation and could again be useful in cancer therapy. Furthermore, the protein
may also be used to determine biological activity, to raise antibodies, as tissue
markers, to isolate cognate ligands or receptors, to identify agents that modulate their
interactions, in addition to its use as a nutritional supplement. Protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO: 16 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 2195 of SEQ ID NO: 16, b is an
integer of 15 to 2209, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO: 16, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 7 

The translation product of this gene shares homology with the human and rat
HNK-1 sulfotransferase protein (See, e.g., Genbank Accession Nos. gb|AAB88123.1|
(AF022729) and gi|2921306|gb|AAC04707.1| (AF033827); all references available
through these accessions are hereby incorporated herein by reference.) Ong E, et al. (J
Biol Chem. 273(9):5190-5 (1998)) have characterized the human HNK-1
sulfotransferase, and show that it is involved in the synthesis of the HNK-1
carbohydrate epitope which is expressed on various adhesion molecules in the
nervous system and on immune cells (e.g., natural killer cells) and is suggested to
play a role in cell-cell and cell-substratum interactions. Based on the sequence
similarity, the translation product of this gene is expected to share at least some
biological activities with HNK-1 sulfotransferase proteins. Such activities are known
in the art, some of which are described elsewhere herein, or in, for example, Bakker,
et al., J Biol Chem. 272:29942-6 (1997), incorporated herein by reference. Based on
sequence similarity between sulfotransferases, a consensus sequence for the active
site was developed (Ong, et al., supra). The consensus pattern is as follows:

xxRPDzzzz, where x represents hydrophobic amino acid residues and z represents
any amino acid residue. Therefore, preferred polypeptides of the invention comprise
the following amino acid sequences: FVRDPFVRL (SEQ ID NO: 267), FLFVRDPFVRLIS
(SEQ ID NO: 268), FLFVRDPFVRLISAF (SEQ ID NO: 269), and/or YLHTSFSRPHTGPPLPTPG
PDRDRELTADSDVDEFLDKFLSAGVKQSDLPRKETEQPPAPGSMEENVRGY
DWSPRDARRSPDQGRQQAERRSVLRGFCANSSLAFPTKERAFDDIPNSELSHL
IVDDRHGAIYCYVPKVACTNWKRVMIVLSGSLLHRGAPYRDPLRIPREHVH
NASAHLTFNKPWRRYGKLSRHLMKVKLKKYTKFLFVRDPFVRLISAFRSK
FELENEEFYRKFAVPMLRLYANHTSLPASAREAFRAGLKVSFANFIQYLLDPH
TEKLAPFNEHWRQVYRLCHPCQIDYDFVGKLETLDEDAAQLLQLLQVDRQ
LRPPPSYRNRTASSWEEDWFAKIPLAWRQQLYKLYEADFVLFGYPKPENLL
RD (SEQ ID NO: 270). Polynucleotides encoding these polypeptides are also
provided.
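
A consensus pattern of this kind can be searched for with a regular expression. The following is a minimal sketch in Python and is not part of the original disclosure. The set of residues counted as hydrophobic is an assumption, since the text does not enumerate them, and the name `find_motif` is hypothetical. The pattern as printed has the core tripeptide in the order R-P-D, whereas the listed fragment FVRDPFVRL (SEQ ID NO: 267) carries R-D-P at that position, so the core is left as a parameter rather than fixed.

```python
import re

# Assumption: the residues treated as "hydrophobic" (x) are not enumerated
# in the text; this set is one common convention and is illustrative only.
HYDROPHOBIC = "AVLIMFWYC"


def find_motif(sequence: str, core: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for each x-x-core-z-z-z-z hit."""
    pattern = re.compile(f"[{HYDROPHOBIC}]{{2}}{re.escape(core)}.{{4}}")
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]


fragment = "FLFVRDPFVRLISAF"  # SEQ ID NO: 269 as listed above

print(find_motif(fragment, "RDP"))  # core order found in the listed fragments
print(find_motif(fragment, "RPD"))  # core order as printed in the pattern
```

Note that `finditer` reports non-overlapping matches only, which is sufficient for locating a single active site in a fragment of this size.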

Further preferred are the sulfotransferase active site polypeptides listed above,
and at least 5, 10, 15, 20, 25, 30, 50, or 75 additional contiguous amino acid residues
of the sequence referenced in Table I for this gene. The additional contiguous amino
acid residues are N-terminal or C-terminal to the sulfotransferase active site
polypeptides. Alternatively, the additional contiguous amino acid residues are both
N-terminal and C-terminal to the sulfotransferase active site polypeptides, wherein the
total N- and C-terminal contiguous amino acid residues equal the specified number.
The above preferred polypeptide domains are characteristic of a signature specific to
sulfotransferase proteins. The nucleotide sequence of this gene was found to be
homologous to the human hypoxanthine guanine phosphoribosyl transferase 2 cDNA
which is known to be involved in the purine salvage pathway resulting in the
maintenance of homeostatic levels of uric acid (See Genbank Accession No. T30127).
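
The flanking-residue embodiment described above (an active site polypeptide plus a specified number of additional contiguous residues placed N-terminal, C-terminal, or split between both) can be enumerated mechanically. The following is a minimal sketch in Python and is not part of the original disclosure; the name `extended_fragments` is hypothetical, and `context` is a short excerpt of SEQ ID NO: 270 as printed above, assumed to be contiguous.

```python
def extended_fragments(full_seq: str, site: str, extra: int) -> list[str]:
    """List every fragment made of `site` plus exactly `extra` contiguous
    flanking residues of `full_seq`, split in all possible ways between
    the N-terminal and C-terminal sides."""
    start = full_seq.find(site)
    if start < 0:
        raise ValueError("site not found in full_seq")
    end = start + len(site)
    fragments = []
    for n_term in range(extra + 1):
        c_term = extra - n_term
        # Skip splits that would run past either end of the full sequence.
        if n_term <= start and c_term <= len(full_seq) - end:
            fragments.append(full_seq[start - n_term:end + c_term])
    return fragments


context = "KYTKFLFVRDPFVRLISAFRSK"  # excerpt of SEQ ID NO: 270
site = "FVRDPFVRL"                  # SEQ ID NO: 267

for fragment in extended_fragments(context, site, 5):
    print(fragment)
```

With six additional residues, one of the enumerated fragments is FLFVRDPFVRLISAF (SEQ ID NO: 269), which carries two N-terminal and four C-terminal residues around the active site.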

The gene encoding the disclosed cDNA is believed to reside on chromosome
7. Accordingly, polynucleotides related to this invention are useful as a marker in
linkage analysis for chromosome 7.

This gene is expressed to a very high level in HL-60 myelogenous leukemia
cell lines, and to a lesser extent, in most cell types of the immune system.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, immune, myelopoiesis, and metabolic disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune and
hematopoietic systems, expression of this gene at significantly higher or lower levels
is routinely detected in certain tissues or cell types (e.g., immune, metabolic, or
cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum, plasma, urine,
synovial fluid and spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 130 as residues: Ser-39 to Gly-46, Leu-49 to Ala-62,
Lys-79 to Ala-93, Gly-95 to Asp-105, Ser-107 to Val-127, Gly-193 to Leu-200,
Lys-218 to Ser-227, Lys-234 to Thr-239, Pro-366 to Asp-379, Pro-406 to Asp-414.
Polynucleotides encoding said polypeptides are also provided.

The tissue distribution in HL-60 myelogenous leukemia cell lines and
homology to HNK-1 sulfotransferase proteins indicates that polynucleotides and
polypeptides corresponding to this gene are useful for the diagnosis, prevention
and/or treatment of a variety of immune system disorders, including but not limited
to, those involving the HNK-1 carbohydrate epitope (e.g., HNK-1 as an auto-antigen
in peripheral demyelinative neuropathy). Representative uses are described in the
"Immune Activity" and "infectious disease" sections below, in Example 11, 13, 14,
16, 18, 19, 20, and 27, and elsewhere herein. Briefly, expression of this gene product
in tonsils indicates a role in regulating the proliferation; survival; differentiation;
and/or activation of hematopoietic cell lineages, including blood stem cells. This gene
product is involved in the regulation of cytokine production, antigen presentation, or
other processes suggesting a usefulness in the treatment of cancer (e.g., by boosting
immune responses).

Since the gene is expressed in cells of lymphoid origin, the natural gene
product is involved in immune functions. Therefore it is also used as an agent for
immunological disorders including arthritis, asthma, immunodeficiency diseases such
as AIDS, leukemia, rheumatoid arthritis, granulomatous Disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
Disease, scleroderma and tissues. Moreover, the protein may represent a secreted
factor that influences the differentiation or behavior of other blood cells, or that
recruits hematopoietic cells to sites of injury. In addition, this gene product may have
commercial utility in the expansion of stem cells and committed progenitors of
various blood lineages, and in the differentiation and/or proliferation of various cell
types. Alternatively, the homology to a conserved purine metabolism protein may
suggest that polynucleotides and polypeptides corresponding to this gene are useful
for the diagnosis, prevention, and/or treatment of various metabolic disorders such as
Tay-Sachs Disease, phenylketonuria, galactosemia, porphyrias, Hurler's syndrome,
and various urogenital disorders related to metabolic conditions, particularly
Lesch-Nyhan syndrome. Furthermore, the protein may also be used to determine biological
activity, to raise antibodies, as tissue markers, to isolate cognate ligands or receptors,
to identify agents that modulate their interactions, in addition to its use as a nutritional
supplement. Protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO: 17 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1760 of SEQ ID NO: 17, b is an
integer of 15 to 1774, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO: 17, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 8 

When tested against Jurkat T-cell lines, supernatants removed from cells
containing this gene activated the gamma activating sequence (GAS) promoter
element. GAS is a promoter element found upstream of many genes which are
involved in the Jak-STAT pathway, a large signal transduction pathway involved in
the differentiation and proliferation of cells. Therefore, activation of the Jak-STAT
pathway, reflected by the binding of the GAS element, can be used to indicate
proteins involved in the proliferation and differentiation of cells. Thus, it is likely that
this gene activates T-cells through the Jak-STAT signal transduction pathway.

In a specific embodiment, polypeptides comprising the amino acid sequence
of the open reading frame upstream of the predicted signal peptide are contemplated
by the present invention. Specifically, polypeptides of the invention comprise the
following amino acid sequence: KLVRLQVPVRNSRVDPRVRS.KIGSRRWMLQLI
MQLGSVLLTRCPFWGCFSQLMLYAERAEARRKPDIPVPYLYFDMGAAVLCA
SFMSFGVKRRWFALGAALQLAISTYAAYIGGYVHYGDWLKVRMYSRTVAII
GGFLVLASGAGELYRRKPRSRSLQSTGQVFLGIYLICVAYSLQHSKEDRLA
YLNHLPGGELMIQLFFVLYGILALAFLSGYYVTLAAQILAVLLPPVMLLIDG
NVAYWHNTRRVEFWNQMKLLGESVGIFGTAVILATDG (SEQ ID NO: 271).

A preferred polypeptide fragment of the invention comprises the following
amino acid sequence: MQLGSVLLTRCPFWGCFSQLMLYAERAEARRKPDIPVP
YLYFDMGAAVLCASFMSFGVKRRWFALGAALQLAISTYAAYIGGYVHYGD
WLKVRMYSRTVAIIGGFLVLASGAGELYRRKPRSRSLQSTGQVFLGIYLICVA
YSLQHSKEDRLAYLNHLPGGELMIQLFFVLYGILAPGLSVRLLRDPRCPDPGC
TAAPCHAAH (SEQ ID NO: 272). Polynucleotides encoding these polypeptides are
also provided.

The gene encoding the disclosed cDNA is believed to reside on chromosome
17. Accordingly, polynucleotides related to this invention are useful as a marker in
linkage analysis for chromosome 17.

This gene is expressed primarily in endometrial tumors, and to a lesser extent,
in T-cells, pituitary and to a certain extent in most cell types.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, female reproductive, immune, or endocrine disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the reproductive
and/or immune systems, expression of this gene at significantly higher or lower levels
is routinely detected in certain tissues or cell types (e.g., immune, reproductive, or
cancerous and wounded tissues) or bodily fluids (e.g., amniotic fluid, lymph, serum,
plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken
from an individual having such a disorder, relative to the standard gene expression
level, i.e., the expression level in healthy tissue or bodily fluid from an individual not
having the disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 131 as residues: Ala-27 to Asp-34, Tyr-116 to
Leu-125. Polynucleotides encoding said polypeptides are also provided.

The tissue distribution predominantly in the endometrium indicates that
polynucleotides and polypeptides corresponding to this gene are useful for the
detection, treatment, and/or prevention of a range of immune and/or reproductive
disorders including endometriosis, endometritis, and endometrioma. Similarly, the
tissue distribution in T-cells and the ability of supernatants expressing this gene to
stimulate the GAS promoter element in T-cells indicates polynucleotides and
polypeptides corresponding to this gene are useful for the diagnosis and treatment of a
variety of immune system disorders. Representative uses are described in the
"Immune Activity" and "infectious disease" sections below, in Example 11, 13, 14,
16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the expression of this gene
product indicates a role in regulating the proliferation; survival; differentiation; and/or
activation of hematopoietic cell lineages, including blood stem cells. This gene
product is involved in the regulation of cytokine production, antigen presentation, or
other processes suggesting a usefulness in the treatment of cancer (e.g., by boosting
immune responses).

Since the gene is expressed in cells of lymphoid origin, the natural gene
product is involved in immune functions. Therefore it is also used as an agent for
immunological disorders including arthritis, asthma, immunodeficiency diseases such
as AIDS, leukemia, rheumatoid arthritis, granulomatous Disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
Disease, scleroderma and tissues. Moreover, the protein may represent a secreted
factor that influences the differentiation or behavior of other blood cells, or that
recruits hematopoietic cells to sites of injury. In addition, this gene product may have
commercial utility in the expansion of stem cells and committed progenitors of
various blood lineages, and in the differentiation and/or proliferation of various cell
types. Alternatively, the tissue distribution in pituitary indicates polynucleotides and
polypeptides corresponding to this gene are useful for the detection, treatment, and/or
prevention of various endocrine disorders and cancers. Representative uses are
described in the "Biological Activity", "Hyperproliferative Disorders", and "Binding
Activity" sections below, in Example 11, 17, 18, 19, 20 and 27, and elsewhere herein.
Briefly, the protein can be used for the detection, treatment, and/or prevention of
Addison's Disease, Cushing's Syndrome, and disorders and/or cancers of the pancreas
(e.g., diabetes mellitus), adrenal cortex, ovaries, pituitary (e.g., hyper-,
hypopituitarism), thyroid (e.g., hyper-, hypothyroidism), parathyroid (e.g., hyper-,
hypoparathyroidism), hypothalamus, and testes.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO: 18 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1660 of SEQ ID NO: 18, b is an
integer of 15 to 1674, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO: 18, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 9 

Contact of cells with supernatant expressing the product of this gene has been
shown to increase the permeability of the plasma membrane of the myeloid leukemia
cell line AML-193 to calcium. Thus, it is likely that the product of this gene is
involved in a signal transduction pathway that is initiated when the product binds a
receptor on the surface of the plasma membrane of myeloid leukemia cells, in
addition to other cell-lines or tissue cell types. Thus, polynucleotides and polypeptides
have uses which include, but are not limited to, activating myeloid leukemia cells.

In another embodiment, polypeptides comprising the amino acid sequence of
the open reading frame upstream of the predicted signal peptide are contemplated by
the present invention. Specifically, polypeptides of the invention comprise the
following amino acid sequence: SNEILLSFPQNYYIQWLNGSLIHGLWNLASLFS
NLCLFVLMPFAFFFLESEGFAGLKKGIRARILETLVMLLLLALLILGIVWVAS
ALIDNDAASMESLYDLWEFYLPYLYSCISLMGCLLLLLCTPVGLSRA4FTVMG
HLLVKPTILEDLDEQIYIITLEEEALQRRLNGLSSSVEYNIMELEQELENVKTL
KTKLERRKKASAWERNLVYPAVMVLLLIETSISVLLVACNILCLLVDETAM
PKGTRGPGIGNASLSTFGFVGAALEIILIFYLMVSSVVGFYSLRFFGNFTPKKD
DTTMTKIIGNCVSILVLSSALPVMSRTLGITRFDLLGDFGRFNWLGNFYIVLS
YNLLFAIVTTLCLVRKFTSAVREELFKALGLHKLHLPNTSRDSETAKPSVNGH
QKAL (SEQ ID NO: 273). Polynucleotides encoding these polypeptides are also
provided.

This gene is expressed primarily in fetal heart, and to a lesser extent, in colon
and the adult pulmonary system.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, heart, lung and digestive disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the cardiovascular, pulmonary
and digestive systems, expression of this gene at significantly higher or lower levels is
routinely detected in certain tissues or cell types (e.g., developmental, cardiovascular,
or cancerous and wounded tissues) or bodily fluids (e.g., lymph, amniotic fluid,
serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 132 as residues: Glu-67 to Asn-74, Glu-88 to Asn-93,
Lys-95 to Ala-107, Ala-147 to Arg-153, Phe-197 to Thr-205, Pro-292 to His-308.
Polynucleotides encoding said polypeptides are also provided.
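
Epitope lists of this form (three-letter residue code, hyphen, position, "to", second residue) can be turned into numeric ranges mechanically. The sketch below is only an illustration of that reading; the helper name `parse_epitopes` and the regular expression are assumptions introduced here, and the code does not check the residues against any actual sequence.

```python
import re

# Matches ranges such as "Glu-67 to Asn-74" (three-letter residue codes assumed).
EPITOPE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def parse_epitopes(text):
    """Return a list of (start_residue, start_pos, end_residue, end_pos) tuples."""
    return [
        (m.group(1), int(m.group(2)), m.group(3), int(m.group(4)))
        for m in EPITOPE_RE.finditer(text)
    ]


def epitope_lengths(text):
    """Return the inclusive length, in residues, of each epitope range in text."""
    return [end - start + 1 for _, start, _, end in parse_epitopes(text)]
```

Applied to a string such as "Glu-67 to Asn-74, Glu-88 to Asn-93", the first helper yields one tuple per range, and the second gives the number of residues each range spans.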

The tissue distribution in fetal heart, colon and pulmonary tissues and the
biological activity in increasing the permeability of the plasma membrane of the
myeloid leukemia cell line AML-193 to calcium, likely indicating that the product of
this gene is involved in a signal transduction pathway that is initiated when the
product binds a receptor on the surface of the plasma membrane of myeloid leukemia
cells, indicate that polynucleotides and polypeptides corresponding to this gene are
useful for the treatment, prevention, and/or detection of a range of disorders including
a variety of vascular disorders and conditions, which include, but are not limited to,
microvascular disease, vascular leak syndrome, aneurysm, stroke, embolism,
myocardial infarction, myocarditis, ischemia, thrombosis, coronary artery disease,
arteriosclerosis, and/or atherosclerosis; pulmonary edema and embolism, bronchitis
and/or cystic fibrosis; Crohn's Disease and/or colon cancer. Similarly, the tissue
distribution indicates that polynucleotides and polypeptides corresponding to this
gene are useful for the diagnosis and treatment of cancer and other proliferative
disorders. Expression within embryonic tissue and other cellular sources marked by
proliferating cells indicates that this protein may play a role in the regulation of
cellular division. Similarly, embryonic development also involves decisions involving
cell differentiation and/or apoptosis in pattern formation. Thus this protein may also
be involved in apoptosis or tissue differentiation and could again be useful in cancer
therapy. Furthermore, the protein may also be used to determine biological activity, to
raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify
agents that modulate their interactions, in addition to its use as a nutritional
supplement. Protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO: 19 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 2004 of SEQ ID NO: 19, b is an
integer of 15 to 2018, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO: 19, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 10 

The protein product of this clone shares sequence homology with the human
MaxiK channel beta 2 subunit (See Genbank Accession No.
gb|AAD23380.1|AF099137_1 (AF099137); all references available through this
accession are hereby incorporated herein by reference), which is believed to be a
modulatory subunit of the voltage and Ca2+ activated K+ (MaxiK) channel.
Additionally, this protein shares homology to the human calcium-activated potassium
channel beta subunit, which, when combined with its corresponding alpha subunit and
modulating peptide, is believed to be useful in treating asthma, angina, hypertension,
incontinence, migraine, and irritable bowel syndrome (IBS). The subsequent
heteromultimer that forms upon combining the alpha, beta, and modulator subunits
is also thought to be useful in preventing premature labour, preventing and treating
cerebral ischemia, inducing pain modulation and decreasing neurogenic inflammation
in a patient (See GeneSeq Accession No. R85306).

Preferred polypeptides comprise the soluble domain which consists of the 
following amino acid sequence: RSYMQSVWTEESQCTLLNASITETFNCSFSCGP 
DCWKLSQYPCLQVYVNTTSSGEiaLLYHTEETIKIjNQKCSYIPKCGKNFEESM 
SLVNVVMENFRKYQHFSCYSDPEGNQKSVILTKLYSSNVLFHSLFWPTCMMA 
GGVAIVAMVKLTQYLSLLCERIQRINR (SEQ ID NO: 274). Polynucleotides 
encoding these polypeptides are also provided. Based on the sequence similarity, the 
translation product of this gene is expected to share at least some biological activities 
with modulatory subunits of voltage and Ca2+ activated K+ channel proteins. Such 
activities are known in the art, some of which are described in Wallner, et al., PNAS
96:4137-4142 (1999), incorporated herein by reference. 

This gene is expressed primarily in adrenal gland tumor, and to a lesser extent, 
in Hodgkin's lymphoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, endocrine and immune disorders, particularly Hodgkin's Lymphoma. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
immune and/or endocrine systems, expression of this gene at significantly higher or 
lower levels is routinely detected in certain tissues or cell types (e.g., immune,
endocrine, or cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum,
plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken
from an individual having such a disorder, relative to the standard gene expression
level, i.e., the expression level in healthy tissue or bodily fluid from an individual not
having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 133 as residues: Trp-25 to Gln-30, Pro-50 to Gln-57,
Pro-93 to Glu-101, Arg-114 to Cys-121, Ser-123 to Gln-129, Ile-177 to Arg-182.
Polynucleotides encoding said polypeptides are also provided. 

The tissue distribution in adrenal gland tumor and its identification as the
modulatory subunit of the voltage and Ca2+ activated K+ (MaxiK) channel indicate
that polynucleotides and polypeptides corresponding to this gene are useful for the
detection, treatment, and/or prevention of various endocrine disorders and cancers,
particularly Addison's Disease, Cushing's Syndrome, and disorders and/or cancers of
the pancreas (e.g., diabetes mellitus), adrenal cortex, ovaries, pituitary (e.g., hyper-,
hypopituitarism), thyroid (e.g., hyper-, hypothyroidism), parathyroid (e.g., hyper-,
hypoparathyroidism), hypothalamus, and testes. Alternatively, expression in
proliferative immune tissues combined with its homology to a novel human K
channel indicates that polynucleotides and polypeptides corresponding to this gene
are useful for the diagnosis and treatment of a variety of immune system disorders.
Representative uses are described in the "Immune Activity" and "infectious disease"
sections below, in Examples 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere herein.
Briefly, the expression of this gene product in Hodgkin's lymphoma indicates a role in
regulating the proliferation; survival; differentiation; and/or activation of
hematopoietic cell lineages, including blood stem cells. This gene product is involved
in the regulation of cytokine production, antigen presentation, or other processes
suggesting a usefulness in the treatment of cancer (e.g., by boosting immune
responses).

Since the gene is expressed in cells of lymphoid origin, the natural gene
product is involved in immune functions. Therefore it is also used as an agent for
immunological disorders including arthritis, asthma, immunodeficiency diseases such
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
Disease, scleroderma and tissues. Moreover, the protein may represent a secreted
factor that influences the differentiation or behavior of other blood cells, or that
recruits hematopoietic cells to sites of injury. In addition, this gene product may have
commercial utility in the expansion of stem cells and committed progenitors of
various blood lineages, and in the differentiation and/or proliferation of various cell
types. Furthermore, the protein may also be used to determine biological activity,
raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify
agents that modulate their interactions, in addition to its use as a nutritional
supplement. Protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO: 20 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 2084 of SEQ ID NO: 20, b is an
integer of 15 to 2098, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO: 20, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 11

The translation product of this gene shares homology with collagen and
collagen like proteins (See, e.g., Genbank Accession Nos.
gi|2920535|gb|AAC39658.1| (AF018081) and gi|2384942|gb|AAB69961.1|
(AF022985); all references available through these accession numbers are hereby
incorporated by reference herein). Additionally, it has been determined that this gene
has homology to the human Kruppel related zinc finger protein (HTF10) which is
known to be important as a transcription factor, particularly in development (See
Genbank Accession No. L11672).

In a specific embodiment, polypeptides comprising the amino acid sequence
of the open reading frame upstream of the predicted signal peptide are contemplated 
by the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: AFAHLQLGPMWKLWRAEEGAAALGGALFLLL 
FALGVRQLLKQRRPMGFPPGPPGLPFIGNIYSLAASSELPHVYMRKQSQVYG 
EVQPRRAPGREGRQAGPGWPGPSWLDLWPPLGRLVGTSPCAGCPLRDTRFPG 
LEGRSPRRRAPLQGEPRPCR (SEQ ID NO: 275). Polynucleotides encoding these
polypeptides are also provided. 

This gene is expressed primarily in human erythroleukemia cell line (HEL), 
serum induced smooth muscle, and to a lesser extent in human 8 week whole embryo. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, leukemia, musculoskeletal, or developmental disorders. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the hematopoietic 
system and muscular system, expression of this gene at significantly higher or lower 
levels is routinely detected in certain tissues or cell types (e.g., immune, 
musculoskeletal, or cancerous and wounded tissues) or bodily fluids (e.g., lymph, 
amniotic fluid, serum, plasma, urine, synovial fluid and spinal fluid) or another tissue
or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 134 as residues: Leu-30 to Gly-38, Arg-67 to Val-72,
Val-76 to Ala-89, Pro-118 to Arg-123, Gly-129 to Ala-136, Leu-138 to Arg-146.
Polynucleotides encoding said polypeptides are also provided. 

The tissue distribution in human erythroleukemia cell line (HEL), and serum
induced smooth muscle, and the shared homology with collagen and collagen like
proteins indicate that polynucleotides and polypeptides corresponding to this gene
are useful for disorders of hematopoietic or muscular systems, such as leukemia and
muscular dystrophy. Additionally, the shared homology with collagen proteins would
suggest that this protein may also be important in the diagnosis or treatment of
various autoimmune disorders (e.g., rheumatoid arthritis, lupus, scleroderma,
dermatomyositis, etc.), dwarfism, spinal deformation, joint abnormalities, and
chondrodysplasias (e.g., spondyloepiphyseal dysplasia congenita, familial
osteoarthritis, atelosteogenesis type II, metaphyseal chondrodysplasia type Schmid,
etc.).

The secreted protein can also be used to determine biological activity, to raise 
antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents 
that modulate their interactions and as nutritional supplements. It may also have a 
very wide range of biological activities although no evidence for any is provided in 
the specification. Typical of these are cytokine, cell proliferation/differentiation 
modulating activity or induction of other cytokines;
immunostimulating/immunosuppressant activities (e.g., for treating human
immunodeficiency virus infection, cancer, autoimmune diseases and allergy);
regulation of haematopoiesis (e.g., for treating anaemia or as adjunct to
chemotherapy); stimulation of growth of bone, cartilage, tendons, ligaments and/or
nerves (e.g., for treating wounds); stimulation of follicle stimulating hormone (for
control of fertility); chemotactic and chemokinetic activities (e.g., for treating
infections, tumours); haemostatic or thrombolytic activity (e.g., for treating
haemophilia, cardiac infarction etc.); anti-inflammatory activity (e.g., for treating
septic shock, Crohn's Disease); as antimicrobials; for treating psoriasis or other
hyperproliferative disease; for regulation of metabolism, behaviour, and many others.
Also contemplated is the use of the corresponding nucleic acid in gene therapy
procedures. Furthermore, the protein may also be used to determine biological
activity, to raise antibodies, as tissue markers, to isolate cognate ligands or receptors,
to identify agents that modulate their interactions, in addition to its use as a nutritional
supplement. Protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO: 21 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1732 of SEQ ID NO: 21, b is an
integer of 15 to 1746, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO: 21, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 12 
A preferred polypeptide fragment of the invention comprises the following
amino acid sequence: iVlRVRIGLTLLLCAVLLSLASASSDEEGSQD 
ESLGFODYFDIR (SEQ ID NO: 276). Polynucleotides encoding these polypeptides 
are also provided. 

The gene encoding the disclosed cDNA is believed to reside on chromosome 
8. Accordingly, polynucleotides related to this invention are useful as a marker in
linkage analysis for chromosome 8. 

This gene is expressed primarily in dendritic cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels is routinely detected in certain tissues or cell types
(e.g., immune, or cancerous and wounded tissues) or bodily fluids (e.g., lymph,
serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 135 as residues: Ser-22 to Ser-41, Glu-43 to Thr-50,
Ser-63 to Leu-68, Ser-71 to Gly-84, Ser-96 to Gly-114. Polynucleotides encoding said
polypeptides are also provided. 

The tissue distribution in dendritic cells indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for the diagnosis, prevention, 
and/or treatment of a variety of immune system disorders. Representative uses are
described in the "Immune Activity" and "infectious disease" sections below, in
Examples 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the
expression of this gene product indicates a role in regulating the proliferation;
survival; differentiation; and/or activation of hematopoietic cell lineages, including
blood stem cells. This gene product is involved in the regulation of cytokine
production, antigen presentation, or other processes suggesting a usefulness in the
treatment of cancer (e.g., by boosting immune responses).

Since the gene is expressed in cells of lymphoid origin, the natural gene
product is involved in immune functions. Therefore it is also used as an agent for
immunological disorders including arthritis, asthma, immunodeficiency diseases such
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
Disease, scleroderma and tissues. Moreover, the protein may represent a secreted
factor that influences the differentiation or behavior of other blood cells, or that
recruits hematopoietic cells to sites of injury. In addition, this gene product may have
commercial utility in the expansion of stem cells and committed progenitors of
various blood lineages, and in the differentiation and/or proliferation of various cell
types.

The secreted protein can be used to determine biological activity, to raise 
antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents 
that modulate their interactions and as nutritional supplements. It may also have a
very wide range of biological activities although no evidence for any is provided in
the specification. Typical of these are cytokine, cell proliferation/differentiation
modulating activity or induction of other cytokines;
immunostimulating/immunosuppressant activities (e.g., for treating human
immunodeficiency virus infection, cancer, autoimmune diseases and allergy);
regulation of haematopoiesis (e.g., for treating anaemia or as adjunct to
chemotherapy); stimulation of growth of bone, cartilage, tendons, ligaments and/or
nerves (e.g., for treating wounds); stimulation of follicle stimulating hormone (for
control of fertility); chemotactic and chemokinetic activities (e.g., for treating
infections, tumours); haemostatic or thrombolytic activity (e.g., for treating
haemophilia, cardiac infarction etc.); anti-inflammatory activity (e.g., for treating
septic shock, Crohn's Disease); as antimicrobials; for treating psoriasis or other
hyperproliferative disease; for regulation of metabolism, behaviour, and many others.
Also contemplated is the use of the corresponding nucleic acid in gene therapy
procedures. Furthermore, the protein may also be used to determine biological
activity, raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to
identify agents that modulate their interactions, in addition to its use as a nutritional
supplement. Protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO: 22 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 2862 of SEQ ID NO: 22, b is an
integer of 15 to 2876, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO: 22, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 13 
A preferred polypeptide variant of the invention comprises the following
amino acid sequence: .VIARGSLRRLLRLLVLGLWLALLRSVAGEQAPGTAPC
SRGSSWSADLDKCMDCSTSCPLPAALAHPVVGRSEPDLRAGAAFWLFGLE
TMPQEREVHHPHRGDRRRGLPSCGADPVTMCPLPAGARPLIIHSSILEPVSAS
QTRREPSSSNHKGGGGR (SEQ ID NO: 277). Polynucleotides encoding these
polypeptides are also provided. 

The gene encoding the disclosed cDNA is believed to reside on chromosome 
16. Accordingly, polynucleotides related to this invention are useful as a marker in
linkage analysis for chromosome 16. 

This gene is expressed primarily in tumor growth factor or lipopolysaccharide
treated bone marrow stroma, epithelioid sarcoma, umbilical vein endothelial cells, and
to a lesser extent, in other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, hematopoiesis or immune disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the hematopoietic,
integumentary, or immune systems, expression of this gene at significantly higher or
lower levels is routinely detected in certain tissues or cell types (e.g., hematopoietic,
immune, or cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum,
plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken
from an individual having such a disorder, relative to the standard gene expression
level, i.e., the expression level in healthy tissue or bodily fluid from an individual not
having the disorder.

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 136 as residues: Pro-35 to Trp-42, Pro-65 to Asp-72,
Thr-86 to Phe-93, Ile-97 to Glu-103. Polynucleotides encoding said polypeptides are
also provided. 

The tissue distribution in tumor growth factor or lipopolysaccharide treated
bone marrow stroma, epithelioid sarcoma, and umbilical vein endothelial cells
indicates that polynucleotides and polypeptides corresponding to this gene are useful
for the diagnosis, prevention, and/or treatment of a variety of immune system
disorders. Representative uses are described in the "Immune Activity" and "infectious
disease" sections below, in Examples 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere
herein. Briefly, the expression of this gene product indicates a role in regulating the
proliferation; survival; differentiation; and/or activation of hematopoietic cell
lineages, including blood stem cells. This gene product is involved in the regulation of
cytokine production, antigen presentation, or other processes suggesting a usefulness
in the treatment of cancer (e.g., by boosting immune responses).

Since the gene is expressed in cells of lymphoid origin, the natural gene
product is involved in immune functions. Therefore it is also used as an agent for
immunological disorders including arthritis, asthma, immunodeficiency diseases such
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
Disease, scleroderma and tissues. Moreover, the protein may represent a secreted
factor that influences the differentiation or behavior of other blood cells, or that
recruits hematopoietic cells to sites of injury. In addition, this gene product may have
commercial utility in the expansion of stem cells and committed progenitors of
various blood lineages, and in the differentiation and/or proliferation of various cell
types.

The secreted protein can be used to determine biological activity, to raise
antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents
that modulate their interactions and as nutritional supplements. It may also have a
very wide range of biological activities although no evidence for any is provided in
the specification. Typical of these are cytokine, cell proliferation/differentiation
modulating activity or induction of other cytokines;
immunostimulating/immunosuppressant activities (e.g., for treating human
immunodeficiency virus infection, cancer, autoimmune diseases and allergy);
regulation of haematopoiesis (e.g., for treating anaemia or as adjunct to
chemotherapy); stimulation of growth of bone, cartilage, tendons, ligaments and/or
nerves (e.g., for treating wounds); stimulation of follicle stimulating hormone (for
control of fertility); chemotactic and chemokinetic activities (e.g., for treating
infections, tumours); haemostatic or thrombolytic activity (e.g., for treating
haemophilia, cardiac infarction etc.); anti-inflammatory activity (e.g., for treating
septic shock, Crohn's Disease); as antimicrobials; for treating psoriasis or other
hyperproliferative disease; for regulation of metabolism, behaviour, and many others.
Also contemplated is the use of the corresponding nucleic acid in gene therapy
procedures. Protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO: 23 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1038 of SEQ ID NO: 23, b is an
integer of 15 to 1052, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO: 23, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 14

The translation product of this gene shares sequence homology with
chromaffin granule amine transporter protein, which is thought to be important in
vesicle membrane amine transport, particularly in neural and endocrine tissue, and
the human vesicular monoamine transporter hVMAT1, which is involved in the
regulation of amine storage in cardiovascular, endocrine, and central nervous system
function (See, Genbank Accession Nos. gi|1314290 and gb|AAC50472.1|; all
references available through these accession numbers are hereby incorporated by
reference herein). Based on these sequence similarities, the translation product of
this gene is expected to share at least some biological activities with amine transporter
proteins. Such activities are known in the art, some of which are described in
Erickson, et al., PNAS 93:5166-5171 (1996), and/or Liu, et al., Cell 70:539-551
(1992), which are both incorporated herein by reference.

In a specific embodiment, polypeptides comprising the amino acid sequence
of the open reading frame upstream of the predicted signal peptide are contemplated
by the present invention. Specifically, polypeptides of the invention comprise the
following amino acid sequence: GTSFLDPTLSLFVLEKFNLPAGYVGLVFLGMAL
SYAISSPLFGLLSDKRPPLRKWLLVFGNLITAGCYMLLGPVPILHIKSQLWLL
VLILVVSGLSAGMSIIPTFPEILSCAHENGFEEGLSTLGLVSGLFSAMWSIGAF
MGPTLGGFLYEFaGFEWAAAlQGLWALISGLAMGLFYLLEYSRRKRSKSQNIL
STEEERTTLLPNET (SEQ ID NO: 278). Polynucleotides encoding these
polypeptides are also provided.

The gene encoding the disclosed cDNA is believed to reside on chromosome
6. Accordingly, polynucleotides related to this invention are useful as a marker in
linkage analysis for chromosome 6.

This gene is expressed primarily in colon cancer, osteoclastoma, and T-cell
lymphoma, and to a lesser extent in many tumor or proliferative tissues such as
endometrial tumor, chondrosarcoma, and induced umbilical vein endothelial cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, diseases resulting from disorders in small molecule transport (i.e.,
signalling molecules) in afflicted tissues and organs, particularly of the endocrine and
central nervous systems. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the musculoskeletal, immune, and/or digestive systems
and cancer, expression of this gene at significantly higher or lower levels is routinely
detected in certain tissues or cell types (e.g., neural, endocrine, or cancerous and
wounded tissues) or bodily fluids (e.g., bile, lymph, serum, plasma, urine, synovial
fluid and spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression
level in healthy tissue or bodily fluid from an individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 137 as residues: Ser-114 to Asn-123, Thr-127 to
Thr-132. Polynucleotides encoding said polypeptides are also provided.

The tissue distribution in colon cancer, osteoclastoma, and T-cell lymphoma
and homology to amine transporter family members indicates that polynucleotides
and polypeptides corresponding to this gene are useful for diagnosis and treatment of
disorders or diseases resulting from small molecule transport in afflicted tissues and
organs, particularly that of colon, osteoclasts or T-cells. The expression in cancer
tissues and shared homology with transporter proteins may also indicate its role in
anti-cancer drug resistance. Additionally, the protein can be used to determine
biological activity, to raise antibodies, as tissue markers, to isolate cognate ligands, to
identify agents that modulate their interactions and as nutritional supplements. It may
also have a very wide range of biological activities, although no evidence for any is
provided in the specification. Typical of these are cytokine, cell
proliferation/differentiation modulating activity or induction of other cytokines;
immunostimulating/immunosuppressant activities (e.g., for treating human
immunodeficiency virus infection, cancer, autoimmune diseases and allergy);
regulation of haematopoiesis (e.g., for treating anaemia or as adjunct to
chemotherapy); stimulation of growth of bone, cartilage, tendons, ligaments and/or
nerves (e.g., for treating wounds); stimulation of follicle stimulating hormone (for
control of fertility); chemotactic and chemokinetic activities (e.g., for treating
infections, tumours); haemostatic or thrombolytic activity (e.g., for treating
haemophilia, cardiac infarction etc.); anti-inflammatory activity (e.g., for treating
septic shock, Crohn's Disease); as antimicrobials; for treating psoriasis or other
hyperproliferative disease; or for identifying inhibitors or promoters of the transport
of toxic molecules to vesicles, for regulation of metabolism, behaviour, and many
others. Also contemplated is the use of the corresponding nucleic acid in gene therapy
procedures. The protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy target for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:24 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1527 of SEQ ID NO:24, b is an
integer of 15 to 1541, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:24, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 15 

The translation product of this gene shares sequence homology with the
human prolyl 4-hydroxylase alpha (II) subunit, which is important in catalyzing the
formation of 4-hydroxyproline in collagens, which is essential for the folding of newly
synthesised collagen polypeptide chains into triple-helical molecules (See Genbank
Accession No. gb|AAB71339.1|; all references available through this accession are
hereby incorporated herein by reference). Based on the sequence similarity, the
translation product of this gene is expected to share at least some biological activities
with prolyl 4-hydroxylase proteins. Such activities are known in the art, some of
which are described in Annunen, et al., J. Biol. Chem. 272:17342-17348 (1997),
which is incorporated herein by reference.

When tested against U937 myeloid and Jurkat T-cell cell lines, supernatants
removed from cells containing this gene activated the gamma activating sequence
(GAS), a promoter element found upstream of many genes which are involved in the
Jak-STAT pathway. The Jak-STAT pathway is a large signal transduction pathway
involved in the differentiation and proliferation of cells. Therefore, activation of the
Jak-STAT pathway, reflected by the binding of the GAS element, can be used to
indicate proteins involved in the proliferation and differentiation of cells. Thus, it is
likely that this gene activates myeloid cells and T-cells through the Jak-STAT signal
transduction pathway.

In a specific embodiment, polypeptides comprising the amino acid sequence
of the open reading frame upstream of the predicted signal peptide are contemplated
by the present invention. Specifically, polypeptides of the invention comprise the
following amino acid sequence:
GTREARLRDLTRFYDKVLSLHEDSTTPVANPLLAFTLIKRLQSDWRNVVHSL
EASENIRALKDGYEKVEQDLPAFEDLEGAARALMRLQDVYMLNVKGLAR
GVFQRVTGSAITDLYSPKRLFSLTGDDCFQVGKVAYDMGDYYHAIPWLEEA
VSLFRGSYGEWKTEDEASLEDALDHLAFAYFRAGNVSCALSLSREFLLYSPD
NKRMARNVLKYERLLAESPNHVVAEAVIQRPNIPHLQTRDTYEGLCQTL
GSQPTLYQIPSLYCSYETNSNAYLLLQPIRKEVIHLEPYIALYHDFVSDSEAQ
KIRELAEPWLQRSVVASGEKQLQVEYRISKSAWLKDTVDLKLVTLNHRIAA
LTGLDVRPPYAEYLQVVNYGIGGHYEPHFDHATSPSSPLYRMKSGNRVATFM
lYLSSVEAGGATAFIYANLSVPVVRNAALFWWNLHRSGEGDSDTLHAGCP
VLVGDKWVANKWIHEYGQEFRRPCSSSPED (SEQ ID NO: 282). Additional
preferred polypeptides comprise the following amino acid sequence: GTREA
RLRDLTRFYDKVLSLHEDSTTPVANPLLAFTLIKRLQSDWRNVVHSLEASENI
RALKDGYEKVEQDLPAFEDLEGAARAL (SEQ ID NO: 279), ALMRLQD (SEQ
ID NO: 280), and/or VEAGGAT (SEQ ID NO: 281). Polynucleotides
encoding these polypeptides are also provided.

This gene is expressed primarily in lymph node breast cancer, colon 
carcinoma, and to a lesser extent in osteoblasts and adipocytes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, disorders of connective and immune tissues, particularly autoimmune
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the connective tissues in breast, colon, bone, and fat, expression of this
gene at significantly higher or lower levels is routinely detected in certain tissues or
cell types (e.g., immune, connective, or cancerous and wounded tissues) or bodily
fluids (e.g., lymph, serum, plasma, urine, synovial fluid and spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 138 as residues: Ser-74 to Ala-84, Gln-156 to
Tyr-161, Tyr-184 to Asn-189, Ser-218 to Ile-223, Pro-299 to Ser-308, His-359 to
Thr-368, Tyr-390 to Asp-404. Polynucleotides encoding said polypeptides are also
provided.

The tissue distribution in lymph node breast cancer and colon carcinoma and
homology to prolyl 4-hydroxylase alpha (II) subunit indicates that polynucleotides
and polypeptides corresponding to this gene are useful for intervention of connective
tissue disorders and diseases (e.g., arthritis, trauma, tendonitis, chondromalacia and
inflammation), as well as in the diagnosis or treatment of various autoimmune
disorders such as rheumatoid arthritis, lupus, scleroderma, and dermatomyositis, as
well as dwarfism, spinal deformation, and specific joint abnormalities, as well as
chondrodysplasias, i.e., spondyloepiphyseal dysplasia congenita, familial osteoarthritis,
Atelosteogenesis type II, metaphyseal chondrodysplasia type Schmid. Alternatively,
the tissue distribution within various tissue carcinomas and tumor tissues, and
biological activity reflected by the binding and activation of the GAS promoter
element, indicates that polynucleotides and polypeptides corresponding to this gene
are useful for the diagnosis and treatment of cancer and other proliferative disorders.

Expression in cellular sources marked by proliferating cells indicates that this
protein may play a role in the regulation of cellular division. Similarly, embryonic
development also involves decisions involving cell differentiation and/or apoptosis in
pattern formation. Thus, this protein may also be involved in apoptosis or tissue
differentiation and could again be useful in cancer therapy. The protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy target for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:25 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 2065 of SEQ ID NO:25, b is an
integer of 15 to 2079, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:25, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 16 

In an additional embodiment, polypeptides comprising the amino acid 
sequence of the open reading frame upstream of the predicted signal peptide are 
contemplated by the present invention. Specifically, polypeptides of the invention 
comprise the following amino acid sequence: IQPSHAALLHCRSTFRKTECLDPW 
WVRRQLLGiMAGIGGLQKMKAPHTGVLHLGSVWVFLGPFLLGVGYTLTFNPL 
SGCMSTVRWLNSNITANRTLSRSVCHVTPLHRSLSPHDGEYLRQMLLNSSSR 
AGEAGSWGY (SEQ ID NO: 283). Polynucleotides encoding these polypeptides are 
also provided. 

The gene encoding the disclosed cDNA is believed to reside on chromosome 
20. Accordingly, polynucleotides related to this invention are useful as a marker in 
linkage analysis for chromosome 20. 

This gene is expressed primarily in fetal liver, and, to a lesser extent, in a
variety of fetal and other tissues and cell types.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, liver disorders and cancers (e.g., hepatoblastoma, hepatitis, liver
metabolic diseases and conditions that are attributable to the differentiation of
hepatocyte progenitor cells). Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the liver, expression of this gene at significantly higher
or lower levels is routinely detected in certain tissues or cell types (e.g., hepatic, or
cancerous and wounded tissues) or bodily fluids (e.g., lymph, bile, serum, plasma,
urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 139 as residues: Ser-67 to Tyr-75. Polynucleotides
encoding said polypeptides are also provided.

The tissue distribution in fetal liver indicates that polynucleotides and
polypeptides corresponding to this gene are useful for detection and treatment of liver
disorders and cancers (e.g., hepatoblastoma, jaundice, hepatitis, liver metabolic
diseases and conditions that are attributable to the differentiation of hepatocyte
progenitor cells). In addition, the expression in fetus would suggest a useful role for
the protein product in developmental abnormalities, fetal deficiencies, pre-natal
disorders and various wound-healing models and/or tissue trauma. Furthermore, the
protein may also be used to determine biological activity, to raise antibodies, as tissue
markers, to isolate cognate ligands or receptors, to identify agents that modulate their
interactions, in addition to its use as a nutritional supplement. The protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy target for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:26 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1933 of SEQ ID NO:26, b is an
integer of 15 to 1947, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:26, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 17 

The translation product of this gene shares sequence homology with human
laminin B1, which is thought to be an important structural extracellular matrix
component involved in cell migration and signalling, particularly in stimulating
epithelial cell growth and differentiation (See, Genbank Accession No. gi|186837).

Preferred polypeptides of the invention comprise the following amino acid
sequences: CSSPPGRLPWCWTAPRTLGKHGSLISTLRLTAPLHLAWKMMLS
RKALFVLLNTPVLFHALEGRLFSKLCHHHTIQRTLTVPKFRSS (SEQ ID NO:
284), RSPTSRVQLLKRQSCPCQRNDLNEEPQHFTHYAIYDFIVKGSCFCNG
HADQCIPVHGFRPVKAPGTFHMVHGKCM (SEQ ID NO: 285), and/or HNTAG
SHCQHCAPLYNDRPWEAADGKTGAPNECRTCKCNGHADTCHFDVNVWEAS
GNRSGGVCDDCQHNTEGQYCQRCKPGFYRDLRRPFSAPDACKPCSCHPV
GSAVLPANSVTFCDPSNGDCPCKPGVAGRRCDRCMVGYWGFGDYGCRP
CDCAGSCDPITGDCISSHTDIDWYHEVPDFRPVHNKSEPAWEWEDAQGFSAL
LHSGKCECKEQTLGNAK.AFCGMKYSYyLKlKILSAHDKGTHVEVNVKIK
KVLKSTKLKIFRGKANIISRIMDGQRMHLSNPQSWFGIPCSRT (SEQ ID NO:
286). Polynucleotides encoding these polypeptides are also provided.

Included in this invention as preferred domains are Laminin-type EGF-like
(LE) domain signatures, which were identified using the ProSite analysis tool (Swiss
Institute of Bioinformatics). Laminins are the major noncollagenous components of
basement membranes that mediate cell adhesion, growth, migration, and
differentiation. They are composed of distinct but related alpha, beta and gamma
chains. The three chains form a cross-shaped molecule that consists of a long arm and
three short globular arms. The long arm consists of a coiled coil structure contributed
by all three chains and cross-linked by interchain disulfide bonds. Besides different
types of globular domains, each subunit contains, in its first half, consecutive repeats
of about 60 amino acids in length that include eight conserved cysteines. The tertiary
structure of this domain is remotely similar in its N-terminal to that of the EGF-like
module. It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of
copies of the LE domain in the different forms of laminins is highly variable: from 3
up to 22 copies have been found. A schematic representation of the topology of the
four disulfide bonds in the LE domain is shown below.



[Schematic of the LE domain disulfide bond topology not reproduced.]

'C': conserved cysteine involved in a disulfide bond
'a': conserved aromatic residue
'G': conserved glycine (lower case = less conserved)
's': region similar to the EGF-like domain
position of the pattern



In mouse laminin gamma-1 chain, the seventh LE domain has been shown to
be the only one that binds with a high affinity to nidogen. The binding-sites are
located on the surface within the loops C1-C3 and C5-C6. Long consecutive arrays of
LE domains in laminins form rod-like elements of limited flexibility, which determine
the spacing in the formation of laminin networks of basement membranes. We
derived a signature pattern for the LE domain which covers the C-terminal half of the
repeat starting with the fourth conserved cysteine. The consensus pattern is as
follows: C-x(1,2)-C-x(5)-G-x(2)-C-x(2)-C-x(3,4)-[FYW]-x(3,15)-C [All Cs are
involved in disulfide bonds]
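
A consensus pattern in PROSITE syntax can be checked against a candidate sequence by translating it into a regular expression. The sketch below is illustrative only: the helper name is hypothetical, it handles just the subset of PROSITE syntax needed here, and the pattern string assumes the reading x(1,2) and [FYW] for the two positions that are hard to read in the printed pattern.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE pattern into a Python regular expression.

    Supports single residues, 'x' wildcards, bracketed residue sets, and
    repeat counts such as (5) or (3,4). Other PROSITE syntax is not handled.
    """
    parts = []
    for element in pattern.split("-"):
        # Split off an optional repeat count such as (5) or (3,4).
        match = re.fullmatch(r"([^()]+)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = match.groups()
        regex = "." if residue == "x" else residue
        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"
        parts.append(regex)
    return "".join(parts)


# Assumed reading of the LE domain consensus pattern (see lead-in).
LE_PATTERN = "C-x(1,2)-C-x(5)-G-x(2)-C-x(2)-C-x(3,4)-[FYW]-x(3,15)-C"
LE_REGEX = re.compile(prosite_to_regex(LE_PATTERN))


def has_le_signature(sequence: str) -> bool:
    """Return True if the sequence contains a match to the LE signature."""
    return LE_REGEX.search(sequence) is not None
```

Under this reading, `has_le_signature` can be applied to any amino acid sequence written in one-letter code, such as the preferred domain sequences given in this section.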

Preferred polypeptides of the invention comprise the following amino acid
sequence: CDDCQHNTEGQYCQRCKPGFYRDLRRPFSAPDACKPC (SEQ ID
NO: 287) and/or CPCKPGVAGRRCDRCMVGYWGFGDYGCRPCDCAGSC (SEQ
ID NO: 288). Polynucleotides encoding these polypeptides are also provided.

Further preferred are polypeptides comprising the laminin-type EGF-like
domains listed above, and at least 5, 10, 15, 20, 25, 30, 50, or 75 additional
contiguous amino acid residues of the sequence encoded by this gene. The additional
contiguous amino acid residues are N-terminal or C-terminal to the laminin-type
EGF-like domain. Alternatively, the additional contiguous amino acid residues are both
N-terminal and C-terminal to the laminin-type EGF-like domain, wherein the total N-
and C-terminal contiguous amino acid residues equal the specified number. The
above preferred polypeptide domain is characteristic of a signature specific to
Laminin proteins. Based on the sequence similarity, the translation product of this
gene is expected to share at least some biological activities with Laminin proteins.
Such activities are known in the art, some of which are described elsewhere herein.
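
The extension of a domain by additional contiguous residues can be illustrated as slicing a parent sequence around the domain. This is a minimal sketch with a hypothetical helper name; the parent string used in the usage note is a short stretch of SEQ ID NO: 286 as printed above, surrounding the domain of SEQ ID NO: 287.

```python
def extend_domain(parent: str, domain: str, n_term: int = 0, c_term: int = 0) -> str:
    """Return the domain plus n_term residues before it and c_term after it.

    The flanking residues are taken contiguously from the parent sequence.
    Raises ValueError if the domain is absent or the flanks run off the ends.
    """
    start = parent.find(domain)
    if start < 0:
        raise ValueError("domain not found in parent sequence")
    end = start + len(domain)
    if start - n_term < 0 or end + c_term > len(parent):
        raise ValueError("not enough flanking residues in parent sequence")
    return parent[start - n_term:end + c_term]


# Illustrative inputs copied from the sequences printed in this section.
PARENT = "GNRSGGVCDDCQHNTEGQYCQRCKPGFYRDLRRPFSAPDACKPCSCHPVGSAVLPANSV"
DOMAIN = "CDDCQHNTEGQYCQRCKPGFYRDLRRPFSAPDACKPC"
```

For example, `extend_domain(PARENT, DOMAIN, n_term=5)` adds five N-terminal residues, while `extend_domain(PARENT, DOMAIN, n_term=2, c_term=3)` splits a total of five additional residues across both ends.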

This gene is expressed primarily in osteoblastic tissues and cell types,
including osteoblasts, osteoblastomas and osteoclastomas. Expression is also
abundant in vascular-pulmonary tissues such as lung, micro-vasculature, pulmonary,
endothelial and smooth muscle cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer and malignancies (particularly of osteoblastic tissues and
rhabdomyosarcoma), as well as cardiovascular and respiratory or pulmonary disorders
such as asthma, pulmonary edema, pneumonia, atherosclerosis, restenosis, stroke,
thrombosis, hypertension, inflammation and wound healing. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the cardio-respiratory system
and skeletal system, expression of this gene at significantly higher or lower levels is
routinely detected in certain tissues or cell types (e.g., skeletal, osteoblast,
cardio-respiratory, vascular, or cancerous and wounded tissues) or bodily fluids (e.g., lymph,
pulmonary surfactant, serum, plasma, urine, synovial fluid and spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 140 as residues: Ser-28 to Cys-34, Thr-51 to Thr-58,
Tyr-64 to Asn-81, Asp-111 to Lys-116, Asp-145 to Phe-160, Pro-203 to Glu-217.
Polynucleotides encoding said polypeptides are also provided.

The tissue distribution in osteoblastic tissues and cell types and homology to
laminin indicates that polynucleotides and polypeptides corresponding to this gene are
useful for the treatment, prevention and diagnosis of cardiovascular and respiratory or
pulmonary disorders such as asthma, pulmonary edema, pneumonia, atherosclerosis,
restenosis, stroke, angina, thrombosis, hypertension, inflammation and wound healing.
As a homolog of laminin, this gene product quite possibly has a role in cell adhesion,
migration, proliferation, angiogenesis, chondrogenesis, wound healing and
oncogenesis. An EST (Int J Cancer 1996 May 16;66(4):571-577) with an identical
sequence to part of this contig was shown to be differentially expressed in human
primary myoblasts and embryonal rhabdomyosarcoma and therefore might have an
important role in the determination or maintenance of the normal phenotype, and thus
its loss is possibly involved in the progression of malignancies, particularly of skeletal
muscle. Similarly, the homology to a laminin would suggest a role in the detection
and treatment of disorders and conditions afflicting connective tissues (e.g., arthritis,
trauma, tendonitis, chondromalacia and inflammation), and in the diagnosis or treatment of
various autoimmune disorders such as rheumatoid arthritis, lupus, scleroderma, and
dermatomyositis, as well as dwarfism, spinal deformation, and specific joint
abnormalities, as well as chondrodysplasias, i.e., spondyloepiphyseal dysplasia
congenita, familial osteoarthritis, Atelosteogenesis type II, metaphyseal
chondrodysplasia type Schmid. Furthermore, the protein may also be used to
determine biological activity, to raise antibodies, as tissue markers, to isolate cognate
ligands or receptors, to identify agents that modulate their interactions, in addition to
its use as a nutritional supplement. The protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and/or immunotherapy target for the
above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:27 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 3365 of SEQ ID NO:27, b is an
integer of 15 to 3379, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:27, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 18 

The gene encoding the disclosed cDNA is believed to reside on chromosome 
10. Accordingly, polynucleotides related to this invention are useful as a marker in 
linkage analysis for chromosome 10. 

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the
following amino acid sequence: NISSQYCILKSLEMxVIISGLKLLVLFLKFAPENY 
CLSTETLQMPNRHLRLSKATCYLMKCLLPSYFE (SEQ ID NO: 289). 
Polynucleotides encoding these polypeptides are also provided. 

This gene is expressed primarily in placenta, brain, and to a lesser extent, in a 
variety of other tissues and cell types. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, reproductive, behavioral, or nervous system disorders, such as
depression, schizophrenia, Alzheimer's Disease, Parkinson's Disease, Huntington's
Disease, dementia, paranoia, addictive behavior, epilepsy, transmissible spongiform
encephalopathy (TSE), Creutzfeldt-Jakob disease (CJD). Other diseases and
conditions related to expression in the placenta might include developmental
anomalies and fetal deficiencies, ovarian and endometrial cancers, reproductive
dysfunction and pre-natal disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the central nervous and reproductive systems,
expression of this gene at significantly higher or lower levels is routinely detected in
certain tissues or cell types (e.g., neural, reproductive, or cancerous and wounded
tissues) or bodily fluids (e.g., lymph, amniotic fluid, serum, plasma, urine, synovial
fluid and spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression
level in healthy tissue or bodily fluid from an individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 141 as residues: Ala-16 to Leu-22. Polynucleotides
encoding said polypeptides are also provided.

The tissue distribution in brain indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection, treatment, and/or prevention of
neurodegenerative disease states, behavioral disorders, or inflammatory conditions.
Representative uses are described in the "Regeneration" and "Hyperproliferative
Disorders" sections below, in Examples 11, 15, and 18, and elsewhere herein. Briefly,
the uses include, but are not limited to, the detection, treatment, and/or prevention of
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, Tourette Syndrome,
meningitis, encephalitis, demyelinating diseases, peripheral neuropathies, neoplasia,
trauma, congenital malformations, spinal cord injuries, ischemia and infarction,
aneurysms, hemorrhages, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder, depression, panic disorder, learning disabilities, ALS,
psychoses, autism, and altered behaviors, including disorders in feeding, sleep
patterns, balance, and perception. In addition, expression in placenta would suggest a
possible role in the treatment and diagnosis of developmental anomalies and fetal
deficiencies, ovarian and endometrial cancers, reproductive dysfunction and pre-natal
disorders.

Similarly, expression within embryonic tissue and other cellular sources
marked by proliferating cells indicates that this protein may play a role in the
regulation of cellular division. Similarly, embryonic development also involves
decisions involving cell differentiation and/or apoptosis in pattern formation. Thus,
this protein may also be involved in apoptosis or tissue differentiation and could again
be useful in cancer therapy. Furthermore, the protein may also be used to determine
biological activity, to raise antibodies, as tissue markers, to isolate cognate ligands or
receptors, to identify agents that modulate their interactions, in addition to its use as a
nutritional supplement. The protein, as well as antibodies directed against the protein, may
show utility as a tumor marker and/or immunotherapy target for the above listed
tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:28 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1992 of SEQ ID NO:28, b is an 
integer of 15 to 2006, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:28, and where b is greater than or equal to a + 14. 
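
The a-b formula above can be made concrete with a short sketch. The Python below is illustrative only and is not part of the specification; the function names and the `min_len` parameter are assumptions introduced here. It treats a and b as 1-based, inclusive nucleotide positions, so that the condition b >= a + 14 corresponds to fragments at least 15 nucleotides long.

```python
def is_excluded(a, b, seq_len, min_len=15):
    """Return True if the fragment a-b falls under the a-b formula.

    a and b are 1-based inclusive positions; seq_len is the length of the
    reference sequence (2006 for SEQ ID NO:28 in the text above).
    """
    return (
        1 <= a <= seq_len - min_len + 1   # a is 1 to 1992 when seq_len is 2006
        and min_len <= b <= seq_len       # b is 15 to 2006
        and b >= a + min_len - 1          # b is greater than or equal to a + 14
    )


def excluded_ranges(seq_len, min_len=15):
    """Yield every (a, b) pair described by the formula (hypothetical helper)."""
    for a in range(1, seq_len - min_len + 2):
        for b in range(a + min_len - 1, seq_len + 1):
            yield a, b


def count_excluded(seq_len, min_len=15):
    """Closed-form count of (a, b) pairs: n * (n + 1) / 2 with n start positions."""
    n = seq_len - min_len + 1
    return n * (n + 1) // 2 if n > 0 else 0
```

Under these assumptions, `count_excluded(2006)` gives 1985028 distinct a-b pairs for SEQ ID NO:28, which illustrates why listing every related sequence individually would be cumbersome.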

FEATURES OF PROTEIN ENCODED BY GENE NO: 19 

The translation product of this gene shares sequence homology with the 
murine transforming protein (See, e.g., Genbank Accession No. 
gi|53529|emb|CAA36859.1|; all references available through this accession are hereby 
incorporated by reference herein). 

In a specific embodiment, polypeptides comprising the amino acid sequence 
of the open reading frame upstream of the predicted signal peptide are contemplated 
by the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: PIEGTPAGTGPEFPGRPTRPQRMRSLISSHPCQ 
HLLLLLLLLFLILAILVDVKWYLVLFICISLMTSDVEHLFMCLLAIRISSWR 
NVY (SEQ ID NO: 290). Polynucleotides encoding these polypeptides are also 
provided. 

This gene is expressed primarily in activated and basal T-cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, immunodeficiency, tumor necrosis, infection, lymphomas, auto-immunities, 
cancer, metastasis, wound healing, inflammation, anemias (leukemia) and 
other hematopoietic disorders. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the immune system, expression of this gene at 
significantly higher or lower levels is routinely detected in certain tissues or cell types 
(e.g., immune, or cancerous and wounded tissues) or bodily fluids (e.g., lymph, 
serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution in activated T-cells and the homology to a murine 
transforming protein indicate that polynucleotides and polypeptides corresponding to this 
gene are useful for the diagnosis and treatment of a variety of immune system 
disorders. Representative uses are described in the "Immune Activity" and "infectious 
disease" sections below, in Example 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere 
herein. Briefly, the expression of this gene product indicates a role in regulating the 
proliferation, survival, differentiation, and/or activation of hematopoietic cell 
lineages, including blood stem cells. This gene product is involved in the regulation of 
cytokine production, antigen presentation, or other processes suggesting a usefulness 
in the treatment of cancer (e.g., by boosting immune responses). 

Since the gene is expressed in cells of lymphoid origin, the natural gene 
product is involved in immune functions. Therefore it is also used as an agent for 
immunological disorders including arthritis, asthma, immunodeficiency diseases such 
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory 
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities, 
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and 
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity 
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic 
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's 
Disease, scleroderma and tissues. Moreover, the protein may represent a secreted 
factor that influences the differentiation or behavior of other blood cells, or that 
recruits hematopoietic cells to sites of injury. In addition, this gene product may have 
commercial utility in the expansion of stem cells and committed progenitors of 
various blood lineages, and in the differentiation and/or proliferation of various cell 
types. 

The secreted protein can also be used to determine biological activity, to raise 
antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents 
that modulate their interactions and as nutritional supplements. It may also have a 
very wide range of biological activities although no evidence for any is provided in 
the specification. Typical of these are cytokine, cell proliferation/differentiation 
modulating activity or induction of other cytokines; 
immunostimulating/immunosuppressant activities (e.g. for treating human 
immunodeficiency virus infection, cancer, autoimmune diseases and allergy); 
regulation of haematopoiesis (e.g. for treating anaemia or as adjunct to 
chemotherapy); stimulation of growth of bone, cartilage, tendons, ligaments and/or 
nerves (e.g. for treating wounds); stimulation of follicle stimulating hormone (for 
control of fertility); chemotactic and chemokinetic activities (e.g. for treating 
infections, tumours); haemostatic or thrombolytic activity (e.g. for treating 
haemophilia, cardiac infarction etc.); anti-inflammatory activity (e.g. for treating 
septic shock, Crohn's Disease); as antimicrobials; for treating psoriasis or other 
hyperproliferative disease; for regulation of metabolism, behaviour, and many others. 
Also contemplated is the use of the corresponding nucleic acid in gene therapy 
procedures. Furthermore, the protein may also be used to determine biological 
activity, raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to 
identify agents that modulate their interactions, in addition to its use as a nutritional 
supplement. Protein, as well as, antibodies directed against the protein may show 
utility as a tumor marker and/or immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:29 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 3056 of SEQ ID NO:29, b is an 
integer of 15 to 3070, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:29, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 20 
The translation product of this gene was shown to have homology to the Mus 
musculus ALG-2 protein, which is known to be a Ca(2+)-binding protein 
required for T cell receptor-, Fas-, and glucocorticoid-induced cell death. ALG-2 
mediates Ca(2+)-regulated signals along the death pathway and may play a role in the 
onset of Alzheimer's Disease (See, e.g., Genbank Accession No. gi|1213520; all 
references available through this accession are hereby incorporated by reference 
herein). 

Preferred polypeptides comprise the following amino acid sequence: NWVPT 
CLCPSAPCSFHLLSRFKCLFSPQRLTDIFRRYDTDQDGWIQVSYEQYLSMVFS 
IV (SEQ ID NO: 291), and/or QRLTDIFRRYDTDQDGWIQVSYEQYLSMVFSIV 
(SEQ ID NO: 292). Polynucleotides encoding these polypeptides are also provided. 

When tested against K562 cell lines, supernatants removed from cells 
containing this gene activated the ISRE (interferon-sensitive responsive element). 
Thus, it is likely that this gene activates immune or leukemia cells through the Jaks-STAT 
signal transduction pathway. ISRE is a promoter element found upstream in 
many genes which are involved in the Jaks-STAT pathway. The Jaks-STAT pathway 
is a large, signal transduction pathway involved in the differentiation and proliferation 
of cells. Therefore, activation of the Jaks-STAT pathway, reflected by the binding of 
the ISRE element, can be used to indicate proteins involved in the proliferation and 
differentiation of cells. 

A preferred polypeptide fragment of the invention comprises the following 
amino acid sequence: MFYKLTLILCELSVAGVTQAASQRPLQRLPRHICSQR 
XPPGRCLLKAXLQTTWXXPDKPIPRLSPPLXSDPKR (SEQ ID NO: 293). 
Polynucleotides encoding these polypeptides are also provided. 

The gene encoding the disclosed cDNA is believed to reside on chromosome 
5. Accordingly, polynucleotides related to this invention are useful as a marker in 
linkage analysis for chromosome 5. 

This gene is expressed primarily in placenta, and to a lesser extent, in a variety 
of other tissues and cell types. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, developmental anomalies, fetal deficiencies, ovarian and endometrial 
cancers, reproductive dysfunction and pre-natal disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the reproductive system, 
expression of this gene at significantly higher or lower levels is routinely detected in 
certain tissues or cell types (e.g., reproductive, developmental, or cancerous and 
wounded tissues) or bodily fluids (e.g., lymph, amniotic fluid, serum, plasma, urine, 
synovial fluid and spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 143 as residues: Arg-24 to Arg-31, Ile-33 to Gly-41. 
Polynucleotides encoding said polypeptides are also provided. 

The tissue distribution in placenta indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for the treatment, prevention 
and/or diagnosis of developmental anomalies, fetal deficiencies, ovarian and 
endometrial cancers, reproductive dysfunction and pre-natal disorders. Expression 
within embryonic tissue and other cellular sources marked by proliferating cells, 
combined with the observed ISRE activity and homology to the apoptosis-linked 
ALG-2, indicates that this protein may play a role in the regulation of cellular division, 
and may show utility in the diagnosis, treatment, and/or prevention of developmental 
diseases and disorders, cancer, and other proliferative conditions. Representative uses 
are described in the "Hyperproliferative Disorders" and "Regeneration" sections 
below and elsewhere herein. Briefly, developmental tissues rely on decisions 
involving cell differentiation and/or apoptosis in pattern formation. 
Dysregulation of apoptosis can result in inappropriate suppression of cell 
death, as occurs in the development of some cancers, or in failure to control the extent 
of cell death, as is believed to occur in acquired immunodeficiency and certain 
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of 
potential roles in proliferation and differentiation, this gene product may have 
applications in the adult for tissue regeneration and the treatment of cancers. It may 
also act as a morphogen to control cell and tissue type specification. Therefore, the 
polynucleotides and polypeptides of the present invention are useful in treating, 
detecting, and/or preventing said disorders and conditions, in addition to other types 
of degenerative conditions. Thus this protein may modulate apoptosis or tissue 
differentiation and is useful in the detection, treatment, and/or prevention of 
degenerative or proliferative conditions and diseases. The protein is useful in 
modulating the immune response to aberrant polypeptides, as may exist in 
proliferating and cancerous cells and tissues. The protein can also be used to gain new 
insight into the regulation of cellular growth and proliferation. Furthermore, the 
protein may also be used to determine biological activity, to raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. Protein, as well as, 
antibodies directed against the protein may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:30 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 2213 of SEQ ID NO:30, b is an 
integer of 15 to 2227, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:30, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 21 

The translation product of this gene was shown to have homology to the 
human histo-blood group A transferase (See, e.g., Genbank Accession No. 
gb|AAD26573.1|AF134413.1 (AF134413); all references available through this 
accession are hereby incorporated by reference herein) which is known to represent 
one of the major allogeneic antigens in both erythrocytes and tissues of humans. It has 
been proposed that the A phenotype is associated with the glycosyltransferase that 
converts the H substance associated with the O phenotype to A through the addition 
of alpha 1-3-N-acetylgalactosamine or alpha 1-3-galactosyl residues to the H antigen 
Fuc-alpha1-2Gal-beta1-R. Therefore, the primary product of the histo-blood group A 
is its respective glycosyltransferase. Preferred polypeptides of the invention comprise 
the following amino acid sequence: TSSPVFSFCSMAVREPDHLQ 
RVSLPRYNVSASLQWLPCHRIVLQPWHMCAMWELGQVLFHPVAPREGAAFS 
PVSTLTWPSSCSHSESTMELELQF (SEQ ID NO: 294), LPCHRIV (SEQ ID NO: 
296), SLQWLPCHRIVLQPW (SEQ ID NO: 297), and/or MAVREPDHLQRVSLPR 
(SEQ ID NO: 295). Polynucleotides encoding these polypeptides are also provided. 

This gene is expressed primarily in 12-week-old human embryo. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, developmental anomalies, fetal deficiencies, pre-natal disorders, 
hematopoietic disorders, or cancer. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the developing fetus, expression of this gene at 
significantly higher or lower levels is routinely detected in certain tissues or cell types 
(e.g., hematopoietic, lymph, developing, or cancerous and wounded tissues) or bodily 
fluids (e.g., lymph, amniotic fluid, serum, plasma, urine, synovial fluid and spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy 
tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution in 12 week old embryo indicates that polynucleotides 
and polypeptides corresponding to this gene are useful for the treatment and diagnosis 
of developmental anomalies, fetal deficiencies, pre-natal disorders and cancer. 
Similarly, expression within embryonic tissue and other cellular sources marked by 
proliferating cells indicates that this protein may play a role in the regulation of 
cellular division. Similarly, embryonic development also involves decisions involving 
cell differentiation and/or apoptosis in pattern formation. Thus, this protein may also 
be involved in apoptosis or tissue differentiation and could again be useful in cancer 
therapy. Alternatively, the tissue distribution and homology to human blood group A 
and B glycosyltransferase enzymes indicate that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment and diagnosis of hematopoietic 
related disorders such as anemia, pancytopenia, leukopenia, thrombocytopenia or 
leukemia since stromal cells are important in the production of cells of hematopoietic 
lineages. Representative uses are described in the "Immune Activity" and "infectious 
disease" sections below, in Example 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere 
herein. Briefly, the uses include bone marrow cell ex-vivo culture, bone marrow 
transplantation, bone marrow reconstitution, radiotherapy or chemotherapy of 
neoplasia. 

The gene product may also be involved in lymphopoiesis; therefore, it can be 
used in immune disorders such as infection, inflammation, allergy, immunodeficiency 
etc. In addition, this gene product may have commercial utility in the expansion of 
stem cells and committed progenitors of various blood lineages, and in the 
differentiation and/or proliferation of various cell types. Furthermore, the protein may 
also be used to determine biological activity, to raise antibodies, as tissue markers, to 
isolate cognate ligands or receptors, to identify agents that modulate their interactions, 
in addition to its use as a nutritional supplement. Protein, as well as, antibodies 
directed against the protein may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:31 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1274 of SEQ ID NO:31, b is an 
integer of 15 to 1288, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:31, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 22 

The translation product of this gene shares sequence homology with CD97 
(EMR1), which is thought to be important in both adhesion and signaling processes 
early after leukocyte activation (See, e.g., Genbank Accession No. gi|784994; all 
references available through this accession are hereby incorporated by reference 
herein). EMR1 belongs to a novel family of G-protein receptors that has recently been 
recognized on the basis of homologous primary amino acid sequences and comprises 
receptors to hormones of the secretin/vasoactive intestinal peptide/glucagon family, 
parathyroid hormone and parathyroid hormone-related peptides, growth hormone-releasing 
factor, corticotropin-releasing factor, and calcitonin. Proteins with seven 
transmembrane segments (7TM) define a superfamily of receptors (7TM receptors) 
sharing the same topology: an extracellular N-terminus, three extramembranous loops 
on either side of the plasma membrane, and a cytoplasmic C-terminal tail. Upon 
ligand binding, cytoplasmic portions of the activated receptor interact with 
heterotrimeric G-coupled proteins to induce various second messengers, which 
subsequently activate various signal transduction pathways depending upon the 
specific G-coupled protein associated with the receptor. Preferred polypeptides of the 
invention comprise the following amino acid sequence: CFKRKPKREHCSCP 
ITYQSLGDILNASFFSKRKGMQEVKLNSYVVSGTIGLKEKISLSEPVFLTFRHN 
QPGDKRTKHICVYWEGSEGGRWSTEGCSHVHSNGSYTKCKCFHLSSFAVLV 
ALAPKEDPVLTVITQVGLTISLLCLFLAILTFLLCRPIQNTSTSLHLELSLCLFLA 
HLLFLTGINRTEPEVLCSHAGLLHFLYLACFTWMLLEGLHLFLTVRNLKVAN 
YTSTGRFKKRFMYPVGYGIPAVIIAVSAIVGPQNYGTFTHCWLKLDKGFIWSF 
MGPVAVIILINLVFYFQVLVVILRSKLSSLNKEVSTIQDTRVMTFKAISQLFILGC 
SWGLGFFMVEEVGKTIGSIIAYSFTIINTLQGVLLFVVHCLLNRQVRMEYKKW 
FSGMRKGVETESTEtVlSRSTTQTKTEEVGKSSEIFHKGGTASSSAESTKQPQPQ 
VHLVSAAVVLKMN (SEQ ID NO: 298), and/or FRVKENLRRNGSREDFARRATQ 
LIQSVELSIWNASFASPGKGQISEFDIVYETKRCNETRENAFLEAGNNTMDINC 
ADALKGNLRESTAVALSLINLLGIF (SEQ ID NO: 299). Polynucleotides 
encoding these polypeptides are also provided. 

Included in this invention as preferred domains are two EGF-like protein 
domains, which were identified using the ProSite analysis tool (Swiss Institute of 
Bioinformatics). First, a sequence of about forty amino-acid residues long found in 
the sequence of epidermal growth factor (EGF) has been shown to be present in a 
large number of membrane-bound and extracellular, mostly animal, proteins. Many of 
these proteins require calcium for their biological function and a calcium-binding site 
has been found to be located at the N-terminus of some EGF-like domains. Calcium-binding 
is crucial for numerous protein-protein interactions. We have used the N-terminal 
part of the EGF domain as a consensus pattern. It includes the negative N-terminus 
and the possible hydroxylation site. The consensus pattern is as 
follows: [DEQN].[DEQN]{2}C.{3,14}C.{3,7}C.[DN].{4}[FY].C (The four Cs are 
involved in disulfide bonds). 

Preferred polypeptides of the invention comprise the following amino acid 
sequence: DINECETGLAKCKYKAYCRNKVGGYIC (SEQ ID NO: 300). 
Polynucleotides encoding these polypeptides are also provided. Secondly, post-translational 
hydroxylation of aspartic acid or asparagine to form erythro-beta-hydroxyaspartic 
acid or erythro-beta-hydroxyasparagine has been identified in a 
number of proteins with domains homologous to EGF. Based on sequence 
comparisons of the EGF-homology region that contains hydroxylated Asp or Asn, a 
consensus sequence located in the N-terminal of EGF-like domains has been 
identified that seems to be required by the hydroxylase(s). The consensus sequence is 
as follows: C.[DN].{4}[FY].C.C. 

Preferred polypeptides of the invention comprise the following amino acid 
sequence: CRNKVGGYICSC (SEQ ID NO: 301). Polynucleotides encoding these 
polypeptides are also provided. 
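
The two consensus patterns above can be checked against the listed sequences with a short sketch. The Python below is illustrative only and is not the ProSite tool itself; it assumes the patterns can be read directly as regular expressions in which "." stands for any residue, and the names `EGF_CA`, `ASX_HYDROXYL`, and `find_sites` are chosen here for illustration.

```python
import re

# Consensus patterns from the text, read as regular expressions
# ("." = any residue, "{n}" / "{m,n}" = repeat counts).
EGF_CA = re.compile(r"[DEQN].[DEQN]{2}C.{3,14}C.{3,7}C.[DN].{4}[FY].C")
ASX_HYDROXYL = re.compile(r"C.[DN].{4}[FY].C.C")

SEQ_ID_300 = "DINECETGLAKCKYKAYCRNKVGGYIC"
SEQ_ID_301 = "CRNKVGGYICSC"


def find_sites(pattern, sequence):
    """Return (start, end, matched_text) tuples using 1-based inclusive positions.

    A lookahead wrapper is used so that matches starting at different
    positions are all reported, even when they overlap.
    """
    sites = []
    for m in re.finditer(f"(?=({pattern.pattern}))", sequence):
        sites.append((m.start(1) + 1, m.end(1), m.group(1)))
    return sites


print(find_sites(EGF_CA, SEQ_ID_300))
print(find_sites(ASX_HYDROXYL, SEQ_ID_301))
```

Under this reading, the calcium-binding pattern spans the whole of SEQ ID NO: 300 and the hydroxylation pattern spans the whole of SEQ ID NO: 301. SEQ ID NO: 300 alone does not satisfy the hydroxylation pattern, because that pattern needs two further residues after the final cysteine of SEQ ID NO: 300.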

Further preferred are polypeptides comprising the calcium-binding EGF-like 
domain and aspartic acid and asparagine hydroxylation site listed above, and at least 
5, 10, 15, 20, 25, 30, 50, or 75 additional contiguous amino acid residues of the 
sequence referenced in Table 1 for this gene and the embodiments listed herein. The 
additional contiguous amino acid residues are N-terminal or C-terminal to one or both 
of the listed domains. Alternatively, the additional contiguous amino acid residues are 
both N-terminal and C-terminal to one or both of the listed domains, wherein the total 
N- and C-terminal contiguous amino acid residues equal the specified number. The 
above preferred polypeptide domains are characteristic of a signature specific to EGF-like 
proteins. Based on the sequence similarity and conserved domains, the 
translation product of this gene is expected to share at least some biological activities 
with EGF-like proteins. Such activities are known in the art, some of which are 
described elsewhere herein. 
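
The way a specified number of additional contiguous residues may be divided between the N-terminal and C-terminal sides of a domain can be sketched as follows. This Python is illustrative only: the function name `extend_domain` is an assumption, and the parent sequence is a hypothetical toy (SEQ ID NO: 301 flanked by made-up residues), not the sequence referenced in Table 1.

```python
def extend_domain(parent, start, end, extra):
    """List fragments containing parent[start:end] plus `extra` flanking residues.

    start and end are 0-based, half-open indices of the domain in `parent`.
    Every split n_left + n_right == extra is tried; splits that would run
    past either end of the parent sequence are skipped.
    """
    fragments = []
    for n_left in range(extra + 1):
        n_right = extra - n_left
        if start - n_left < 0 or end + n_right > len(parent):
            continue  # not enough parent sequence on one side
        fragments.append(parent[start - n_left:end + n_right])
    return fragments


# Hypothetical parent: five made-up residues on each side of SEQ ID NO: 301.
domain = "CRNKVGGYICSC"
parent = "AAAAA" + domain + "GGGGG"
start = parent.index(domain)
end = start + len(domain)

for fragment in extend_domain(parent, start, end, 5):
    print(fragment)
```

With five additional residues and five available on each side, the sketch yields six fragments, ranging from all five residues on the C-terminal side to all five on the N-terminal side.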

This gene is expressed primarily in eosinophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, hematopoietic disorders or anemias and leukemias, 
immunodeficiencies, infection, lymphomas, auto-immunities and cancer. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the immune and 
hematopoietic systems, expression of this gene at significantly higher or lower levels 
is routinely detected in certain tissues or cell types (e.g., immune, hematopoietic, or 
cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum, plasma, urine, 
synovial fluid and spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 145 as residues: Ser-22 to Ser-30, Pro-33 to Cys-48, 
Asp-50 to Lys-67, Pro-117 to Ser-130. Polynucleotides encoding said polypeptides 
are also provided. 

The tissue distribution in eosinophils combined with its homology to a known 
human seven transmembrane domain protein indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for the diagnosis and treatment of 
cancer and other proliferative disorders, particularly considering the fact that the 
majority of 7 transmembrane receptors are tightly associated with signal transduction 
pathways which are integral to the modulation of the cell cycle. As such, the protein 
product of this gene may play a role in the regulation of cellular division, where loss 
of regulation may result in proliferating cells and the onset of tumors or cancer. 
Additionally, the expression in hematopoietic cells and tissues indicates that this 
protein may play a role in the proliferation, differentiation, and/or survival of 
hematopoietic cell lineages. In such an event, this gene is useful in the treatment of 
lymphoproliferative disorders, and in the maintenance and differentiation of various 
hematopoietic lineages from early hematopoietic stem and committed progenitor 
cells. Similarly, embryonic development also involves decisions involving cell 
differentiation and/or apoptosis in pattern formation. Thus this protein may also be 
involved in apoptosis or tissue differentiation and could again be useful in cancer 
therapy. Similarly, the tissue distribution and homology to CD97 indicates that the 
protein product of this gene might be a marker for differentiation and activation of 
eosinophils, and therefore is useful for the diagnosis and treatment of immune 
disorders including: leukemias, lymphomas, auto-immunities, immunodeficiencies 
(e.g., AIDS), immunosuppressive conditions (transplantation) and hematopoietic 
disorders. Representative uses are described in the "Immune Activity" and "infectious 
disease" sections below, in Example 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere 
herein. In addition this gene product is applicable in conditions of general microbial 
infection, inflammation or cancer. Furthermore, the protein may also be used to 
determine biological activity, to raise antibodies, as tissue markers, to isolate cognate 
ligands or receptors, to identify agents that modulate their interactions, in addition to 
its use as a nutritional supplement. Protein, as well as, antibodies directed against the 
protein may show utility as a tumor marker and/or immunotherapy targets for the 
above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:32 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 3266 of SEQ ID NO:32, b is an 
integer of 15 to 3280, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:32, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 23 

The translation product of this gene has been found to have homology to the 
rat neural F box protein NFB42, in addition to a conserved Caenorhabditis elegans 
C14B1.3 protein (See, e.g., Genbank Accession Nos. gi|3851648|gb|AAC97505.1| 
(AF098301) and gi|558270; all references available through these accessions are 
hereby incorporated by reference herein). Preferred polypeptides of the invention 
comprise the following amino acid sequence: ALCPHPHLILNVTVSPAPSCRHVK * 
KVVASPSPSTTMIAMDAPHSKAALDSINELPENILLELFTHVPARQLLLNCRL 
VCSLWRDLIDLMTLWKRKCLREGFITKDWDQPVADWKIF^TLRSLHRNLLR 
NPCAEEDMFAWQIDFNGGDRWKVESLPGAHGTDFPDPKVKKYFVTSYEMCL 
KSQLVDLVAEGYWTELLDTFRPDIVVKDWFAARADCGCTYQLKVQLASA 
DYFVLASFEPPPVTIQQWNNATWTEVSYTFSDYPRGVRYILFQHGGRDTQY 
VVAGWYGPRVTNSSIVVSPKMTRNQASSEAQPGQKHGQEEAAQSPYRAVV 
QIF (SEQ ID NO: 302). Polynucleotides encoding these polypeptides are also 
provided. 

The gene encoding the disclosed cDNA is believed to reside on chromosome 
1. Accordingly, polynucleotides related to this invention are useful as a marker in 
linkage analysis for chromosome 1. 

This gene is expressed primarily in immune cells, especially primary dendritic 
cells, and T cells, and to a lesser extent in a variety of other tissues including breast, 
keratinocytes, epididymis (cauda), lung, multiple sclerosis, endometrial stromal
cells, IL4 induced umbilical vein endothelial cells, fetal kidney, fetal dura mater,
rejected kidney, and osteoblasts. 



Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, cancer and other proliferative disorders, particularly of the immune 
system or endothelial cells. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at 
significantly higher or lower levels is routinely detected in certain tissues or cell types 
(e.g., immune, or cancerous and wounded tissues) or bodily fluids (e.g., lymph, 
serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 146 as residues: Pro-41 to Cys-47, Phe-52 to Gly-59,
Pro-62 to His-70. Polynucleotides encoding said polypeptides are also provided.
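
Residue ranges written in the form "Phe-52 to Gly-59" can be converted to numeric coordinates for downstream checks. The following Python sketch is illustrative only and is not part of the disclosure: the regular expression and the function name `parse_epitope_ranges` are hypothetical, and the pattern assumes three-letter amino acid codes followed by a hyphen and a position number.

```python
import re

# Matches ranges such as "Phe-52 to Gly-59" (three-letter code, hyphen, position).
EPITOPE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text):
    """Return (start_residue, start_pos, end_residue, end_pos) tuples found in text."""
    return [
        (start_res, int(start), end_res, int(end))
        for start_res, start, end_res, end in EPITOPE_RE.findall(text)
    ]


ranges = parse_epitope_ranges("Phe-52 to Gly-59, Pro-62 to His-70")

if __name__ == "__main__":
    for start_res, start, end_res, end in ranges:
        # Epitope length in residues, counting both endpoints.
        print(start_res, start, end_res, end, end - start + 1)
```

Ranges damaged by character-recognition errors will not match the pattern and are skipped rather than guessed.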

The tissue distribution in immune cells indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for the diagnosis and treatment of a
variety of immune system disorders. Representative uses are described in the
"Immune Activity" and "infectious disease" sections below, in Example 11, 13, 14,
16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the expression of this gene
product in T-cells indicates a role in regulating the proliferation; survival;
differentiation; and/or activation of hematopoietic cell lineages, including blood stem
cells. This gene product is involved in the regulation of cytokine production, antigen
presentation, or other processes suggesting a usefulness in the treatment of cancer
(e.g., by boosting immune responses).

Since the gene is expressed in cells of lymphoid origin, the natural gene 
product is involved in immune functions. Therefore it is also used as an agent for 
immunological disorders including arthritis, asthma, immunodeficiency diseases such 
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
Disease, scleroderma and tissues. Moreover, the protein may represent a secreted
factor that influences the differentiation or behavior of other blood cells, or that 
recruits hematopoietic cells to sites of injury. In addition, this gene product may have 
commercial utility in the expansion of stem cells and committed progenitors of 
various blood lineages, and in the differentiation and/or proliferation of various cell 
types. 

The secreted protein can also be used to determine biological activity, to raise 
antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents
that modulate their interactions and as nutritional supplements. It may also have a
very wide range of biological activities although no evidence for any is provided in
the specification. Typical of these are cytokine, cell proliferation/differentiation
modulating activity or induction of other cytokines;
immunostimulating/immunosuppressant activities (e.g., for treating human
immunodeficiency virus infection, cancer, autoimmune diseases and allergy);
regulation of haematopoiesis (e.g., for treating anaemia or as adjunct to
chemotherapy); stimulation of growth of bone, cartilage, tendons, ligaments and/or
nerves (e.g., for treating wounds); stimulation of follicle stimulating hormone (for
control of fertility); chemotactic and chemokinetic activities (e.g., for treating
infections, tumours); haemostatic or thrombolytic activity (e.g., for treating
haemophilia, cardiac infarction etc.); anti-inflammatory activity (e.g., for treating
septic shock, Crohn's Disease); as antimicrobials; for treating psoriasis or other
hyperproliferative disease; for regulation of metabolism, behaviour, and many others.
Also contemplated is the use of the corresponding nucleic acid in gene therapy
procedures. Protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:33 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1283 of SEQ ID NO:33, b is an
integer of 15 to 1297, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:33, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 24

The translation product of this gene shares sequence homology with the
human, mouse, and bovine dopamine hydroxylase which is thought to be important in
the modification of dopamine, a neurotransmitter (See Genbank Accession Nos.
gi|30474, gi|162965, and/or gi|2358082; all references available through these
accessions are hereby incorporated by reference herein). Preferred polypeptides of the
invention comprise the following amino acid sequence: RQRSWNPGT

NCYHPNMPDAFLTCETVIFAWAIGGEGFSYPPHVGLSLGTPLDPHYVLLEVH 
YDNPTYEEGLIDNSGLRLPi'TMDIRKYDAGVIEAGLWySLFHTlPPGMPEF 
QSEGHCTLECLEEALEAEKPSGIHVFAVLLHAHLAGRGIRLRHFRKGKEMKL 
LAYDDDFDFNFQEFQVLKEEQTILPGDNLITECRYNTKDRAEMTWGGLSTR 
SEMCLSYLLYYPRINLTRCASIPDIMEQLQFIGVKEIYRPVTTWPFIIKSPKQYK
NLSFMDAMNKFKWTKKECLSFNKLVLSLPVNVRCSKTDNAEWSIPRNDSIT 
SRYRKTL (SEQ ID NO: 303). Polynucleotides encoding these polypeptides are also 
provided. 

A preferred polypeptide fragment of the invention comprises the following 
amino acid sequence: MCCWPLLLLWGLLPGTAAGGSGRTYPHRTLLDSEGK
YWLGWSQRGSQIAFRLQVRTAGYVGFGFSPTGAMASADIVVGGVAHGR 
PYLQDY FTNANRELKKDAQQDYHLEYAMENSTHTIIEFTRELHTCDINDKS 
ITDSTVRVIWAYHHE DAGEAGPKYHDSNRGTKSLRLLNPEKTSVLSTALPYF 
DLVNQDVPIPNKDTTYWCQ.VIFKIPVFQEKHHVIKVEPVIQRGHESLVHHILL 
YQCSNNFNDSVPGIRARIAITPTCP.VIHSSPV KL (SEQ ID NO: 304).
Polynucleotides encoding these polypeptides are also provided.

This gene is expressed primarily in brain, the pulmonary system, and to a
lesser extent in kidney. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, neurological and behavioral disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the nervous system, expression 
of this gene at significantly higher or lower levels is routinely detected in certain 
tissues or cell types (e.g., neural, endocrine, or cancerous and wounded tissues) or 
bodily fluids (e.g., sputum, lymph, serum, plasma, urine, synovial fluid and spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy 
tissue or bodily fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 147 as residues: Ser-33 to Trp-38, Gly-40 to Gly-45,
Asn-93 to Asp-105, Thr-128 to Thr-137, Glu-150 to Lys-167, Pro-197 to Tyr-203,
Cys-242 to Asn-247, Ser-253 to Tyr-258, His-307 to Glu-314, Glu-357 to Gly-362,
Trp-373 to Gln-378, Ser-402 to Glu-408. Polynucleotides encoding said polypeptides
are also provided. 

The tissue distribution in brain and homology to a protein involved in the 
modification of dopamine indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the detection, treatment, and/or prevention of 
neurodegenerative disease states, behavioral disorders, or inflammatory conditions. 
Representative uses are described in the "Regeneration" and "Hyperproliferative 
Disorders" sections below, in Example 11, 15, and 18, and elsewhere herein. Briefly,
the uses include, but are not limited to the detection, treatment, and/or prevention of
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, Tourette Syndrome,
meningitis, encephalitis, demyelinating diseases, peripheral neuropathies, neoplasia,
trauma, congenital malformations, spinal cord injuries, ischemia and infarction,
aneurysms, hemorrhages, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder, depression, panic disorder, learning disabilities, ALS,
psychoses, autism, and altered behaviors, including disorders in feeding, sleep 
patterns, balance, and perception. In addition, the gene or gene product may also play 
a role in the treatment and/or detection of developmental disorders associated with the 
developing embryo, sexually-linked disorders, or disorders of the cardiovascular 
system. Alternatively, the homology to dopamine hydroxylase indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for the
detection, treatment, and/or prevention of various endocrine disorders and cancers,
particularly Addison's Disease, Cushing's Syndrome, and disorders and/or cancers of
the pancreas (e.g., diabetes mellitus), adrenal cortex, ovaries, pituitary (e.g., hyper-,
hypopituitarism), thyroid (e.g., hyper-, hypothyroidism), parathyroid (e.g., hyper-,
hypoparathyroidism), hypothalamus, and testes. Furthermore, the protein may also
be used to determine biological activity, to raise antibodies, as tissue markers, to
isolate cognate ligands or receptors, to identify agents that modulate their interactions,
in addition to its use as a nutritional supplement. Protein, as well as antibodies
directed against the protein may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:34 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 2170 of SEQ ID NO:34, b is an
integer of 15 to 2184, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:34, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 25 

When tested against Jurkat T-cell lines, supernatants removed from cells
containing this gene activated the gamma activating sequence (GAS), a promoter
element found upstream of many genes which are involved in the Jak-STAT pathway.
The Jak-STAT pathway is a large, signal transduction pathway involved in the
differentiation and proliferation of cells. Therefore, activation of the Jak-STAT
pathway, reflected by the binding of the GAS element, can be used to indicate
proteins involved in the proliferation and differentiation of cells. Thus, it is likely that
this gene activates T-cells through the Jak-STAT signal transduction pathway.

This gene is expressed in a variety of human normal and diseased tissues 
including breast, infant adrenal gland, skin tumor, colon, pituitary, Wilms tumor, and
to a lesser extent in other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, breast cancer and other proliferative disorders, afflicting endocrine or 
endothelial tissues. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the endocrine system or of breast and/or breast lymph
nodes, expression of this gene at significantly higher or lower levels is routinely
detected in certain tissues or cell types (e.g., reproductive, endocrine, or cancerous
and wounded tissues) or bodily fluids (e.g., lymph, serum, plasma, urine, synovial
fluid and spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression
level in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution in breast, infant adrenal gland, skin tumor, colon, 
pituitary, and Wilms tumor, and biological activity in activating the GAS promoter
element indicates that polynucleotides and polypeptides corresponding to this gene
are useful for the detection, treatment, and/or prevention of various endocrine
disorders and cancers, particularly Addison's Disease, Cushing's Syndrome, and
disorders and/or cancers of the pancreas (e.g., diabetes mellitus), adrenal cortex,
ovaries, pituitary (e.g., hyper-, hypopituitarism), thyroid (e.g., hyper-,
hypothyroidism), parathyroid (e.g., hyper-, hypoparathyroidism), hypothalamus, and
testes. Alternatively, the tissue distribution and biological activity indicates that
polynucleotides and polypeptides corresponding to this gene are useful for the
diagnosis and treatment of cancer and other proliferative disorders. Expression within
embryonic tissue and other cellular sources marked by proliferating cells (i.e., breast,
skin and Wilms tumors) indicates that this protein may play a role in the regulation of
cellular division. Additionally, the expression in hematopoietic cells and tissues
indicates that this protein may play a role in the proliferation, differentiation, and/or
survival of hematopoietic cell lineages. Representative uses are described in the
"Immune Activity" and "infectious disease" sections below, in Example 11, 13, 14,
16, 18, 19, 20, and 27, and elsewhere herein. In such an event, this gene is useful in
the treatment of lymphoproliferative disorders, and in the maintenance and 
differentiation of various hematopoietic lineages from early hematopoietic stem and 
committed progenitor cells. Similarly, embryonic development also involves 
decisions involving cell differentiation and/or apoptosis in pattern formation. Thus
this protein may also be involved in apoptosis or tissue differentiation and could again 
be useful in cancer therapy. Furthermore, the protein may also be used to determine 
biological activity, to raise antibodies, as tissue markers, to isolate cognate ligands or 
receptors, to identify agents that modulate their interactions, in addition to its use as a 
nutritional supplement. Protein, as well as antibodies directed against the protein, may
show utility as a tumor marker and/or immunotherapy targets for the above listed 
tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:35 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 935 of SEQ ID NO:35, b is an
integer of 15 to 949, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:35, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 26

In another embodiment, polypeptides comprising the amino acid sequence of
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: TGTFWSPRSQRRGCCGRRAPRPEAMENGAVYS 
PTTEEDPGPARGPRSGLAAYFFMGRLPLLRRVLKGLQLLLSLLAnCEEVVSQ 
CTLCGGLYFFEFVSCSAFLLSLLILIVYCTPFYERVDTTKVKSSDFYITLGTGCV 
FLLASIIFVSTHDRTSAEIAAIVFGFIASFMFLLDFITMLYEKRQESQLRKPENTT 
RAEALTEPLNA (SEQ ID NO: 305). Polynucleotides encoding these polypeptides 
are also provided. 

The gene encoding the disclosed cDNA is believed to reside on chromosome 
3. Accordingly, polynucleotides related to this invention are useful as a marker in 
linkage analysis for chromosome 3. 

This gene is expressed primarily in dendritic cells, and to a lesser extent in 
melanocytes, fetal liver and spleen and several other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, inflammation, and disorders of the hepatic and immune systems. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune and hematopoietic systems, expression of this gene at significantly higher or 
lower levels is routinely detected in certain tissues or cell types (e.g., hematopoietic, 
hepatic, immune, or cancerous and wounded tissues) or bodily fluids (e.g., lymph, 
serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 149 as residues: Phe-63 to Ser-75, Thr-97 to Ser-102, 
Glu-128 to Arg-143. Polynucleotides encoding said polypeptides are also provided.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of a variety of
immune system disorders. Representative uses are described in the "Immune
Activity" and "infectious disease" sections below, in Example 11, 13, 14, 16, 18, 19,
20, and 27, and elsewhere herein. Briefly, the expression of this gene product in
dendritic cells indicates a role in the regulation of the proliferation; survival;
differentiation; and/or activation of potentially all hematopoietic cell lineages, 
including blood stem cells. This gene product is involved in the regulation of cytokine 
production, antigen presentation, or other processes that may also suggest a usefulness 
in the treatment of cancer (e.g. by boosting immune responses). 

Since the gene is expressed in cells of lymphoid origin, the natural gene 
product is involved in immune functions. Therefore it is also used as an agent for 
immunological disorders including arthritis, asthma, immunodeficiency diseases such 
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
Disease, scleroderma and tissues. Moreover, the protein may represent a secreted
factor that influences the differentiation or behavior of other blood cells, or that
recruits hematopoietic cells to sites of injury. In addition, this gene product may have
commercial utility in the expansion of stem cells and committed progenitors of 
various blood lineages, and in the differentiation and/or proliferation of various cell 
types. Alternatively, the tissue distribution in fetal liver indicates that polynucleotides 
and polypeptides corresponding to this gene are useful for the detection and treatment
of liver disorders and cancers (e.g., hepatoblastoma, jaundice, hepatitis, liver
metabolic diseases and conditions that are attributable to the differentiation of
hepatocyte progenitor cells). In addition the expression in fetus would suggest a
useful role for the protein product in developmental abnormalities, fetal deficiencies,
pre-natal disorders and various wound-healing models and/or tissue trauma.
Furthermore, the protein may also be used to determine biological activity, raise
antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents
that modulate their interactions, in addition to its use as a nutritional supplement.
Protein, as well as antibodies directed against the protein, may show utility as a tumor
marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:36 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 3324 of SEQ ID NO:36, b is an
integer of 15 to 3338, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:36, and where b is greater than or equal to a + 14.



FEATURES OF PROTEIN ENCODED BY GENE NO: 27 

In a specific embodiment, polypeptides comprising the amino acid sequence 
of the open reading frame upstream of the predicted signal peptide are contemplated 
by the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: ASAPRV.MRGHLAGFPALSGLASVCLWATFSA
QLPGPVA.ATSWTPAPLGCSAARSGPEKRLGTAAPGSAASLAQAGPGAPCRV 
LPVDPAP.AALNVREPGWLGGLFDGALLQVLLNFLRKSTDVLMDTREAESLEV 

E (SEQ ID NO: 306). 

In another embodiment, polypeptides of the invention comprise the following
amino acid sequence: NKLHSFPVFLSQLLLDRQLLHAPQTLPTPHCGGSSRPGP
SHPPWLLIQLPCVHVALWQMLRDFSDSRITPSTLTTQPAAQTAAPAKDQES 
DIVGGEGILCDIAFLQEDHPLGVGGASAPSSRRELSRRGVHTQTLPEDGTLHG 
TPSSSFDCGIKYIISVVPLAPGCDLPSLELSLVCKGVSSCMGFAAG (SEQ ID NO: 
307). Polynucleotides encoding these polypeptides are also provided. 

This gene is expressed primarily in endothelial cells, lung, and fetal kidney,
and to a lesser extent in epididymis, keratinocytes and cerebellum.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cardiovascular diseases involving endothelial cell disturbances such as
atherosclerosis. Similarly, polypeptides and antibodies directed to these polypeptides
are useful in providing immunological probes for differential identification of the
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the cardiovascular system, expression of this gene at significantly 
higher or lower levels is routinely detected in certain tissues or cell types (e.g., 
cardiovascular, or cancerous and wounded tissues) or bodily fluids (e.g., lymph, 
serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 150 as residues: Arg-47 to Leu-54. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in endothelial cells indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for diagnosing and treating 
disorders of endothelial cells such as atherosclerosis, vasculitis, cardiovascular
disease, and emphysema. The secreted protein can also be used to determine 
biological activity, to raise antibodies, as tissue markers, to isolate cognate ligands or 
receptors, to identify agents that modulate their interactions and as nutritional 
supplements. The polypeptide may possess a wide range of undetected biological
activities. Typical of these are cytokine, cell proliferation/differentiation modulating
activity or induction of other cytokines; immunostimulating/immunosuppressant
activities (e.g., for treating human immunodeficiency virus infection, cancer,
autoimmune diseases and allergy); regulation of haematopoiesis (e.g., for
treating anaemia or as adjunct to chemotherapy); stimulation of growth of
bone, cartilage, tendons, ligaments and/or nerves (e.g., for treating wounds); stimulation
of follicle stimulating hormone (for control of fertility); chemotactic and
chemokinetic activities (e.g., for treating infections, tumours); haemostatic or
thrombolytic activity (e.g., for treating haemophilia, cardiac infarction etc.);
anti-inflammatory activity (e.g., for treating septic shock, Crohn's Disease); as
antimicrobials; for treating psoriasis or other hyperproliferative disease; for regulation
of metabolism, behaviour, and many others. Also contemplated is the use of the
corresponding nucleic acid in gene therapy procedures. Protein, as well as antibodies
directed against the protein, may show utility as a tumor marker and/or
immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:37 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1549 of SEQ ID NO:37, b is an 
integer of 15 to 1563, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:37, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 28 

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: PGRPTRPTKNKVCVCLGMLFWAYPICVFIDSL 
SCQPCLWSTGATSHFNSPTTSPLFTLFiVIPCALAPNPFT QLGKLDDR (SEQ ID 
NO: 308). Polynucleotides encoding these polypeptides are also provided. 

This gene is expressed primarily in meningima. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, tumors or disorders of the central nervous system. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the central nervous
system, expression of this gene at significantly higher or lower levels is routinely
detected in certain tissues or cell types (e.g., neural, or cancerous and wounded
tissues) or bodily fluids (e.g., lymph, serum, plasma, urine, synovial fluid and spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy
tissue or bodily fluid from an individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 151 as residues: His-29 to Thr-34. Polynucleotides 
encoding said polypeptides are also provided.

The tissue distribution in meningima indicates polynucleotides and 
polypeptides corresponding to this gene are useful for the detection, treatment, and/or 
prevention of neurodegenerative disease states, behavioral disorders, or inflammatory 
conditions. Representative uses are described in the "Regeneration" and
"Hyperproliferative Disorders" sections below, in Example 11, 15, and 18, and
elsewhere herein. Briefly, the uses include, but are not limited to the detection,
treatment, and/or prevention of Alzheimer's Disease, Parkinson's Disease,
Huntington's Disease, Tourette Syndrome, meningitis, encephalitis, demyelinating
diseases, peripheral neuropathies, neoplasia, trauma, congenital malformations, spinal
cord injuries, ischemia and infarction, aneurysms, hemorrhages, schizophrenia,
mania, dementia, paranoia, obsessive compulsive disorder, depression, panic disorder,
learning disabilities, ALS, psychoses, autism, and altered behaviors, including
disorders in feeding, sleep patterns, balance, and perception, as well as disorders of
the meninges such as meningioma and meningitis. In addition, the gene or gene
product may also play a role in the treatment and/or detection of developmental
disorders associated with the developing embryo, sexually-linked disorders, or
disorders of the cardiovascular system. Furthermore, the protein may also be used to 
determine biological activity, to raise antibodies, as tissue markers, to isolate cognate 
ligands or receptors, to identify agents that modulate their interactions, in addition to 
its use as a nutritional supplement. Protein, as well as, antibodies directed against the 
protein may show utility as a tumor marker and/or immunotherapy targets for the 
above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO: 38 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1034 of SEQ ID NO:38, b is an 
integer of 15 to 1048, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:38, and where b is greater than or equal to a + 14. 
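
The a-b formula above amounts to a range check on fragment coordinates. The following is a minimal sketch only, assuming 1-based inclusive positions and reading "between 1 to 1034" as inclusive at both ends; the function names and default arguments are hypothetical and are not part of the disclosure. The defaults use the bounds stated for SEQ ID NO:38 (b up to 1048, b at least a + 14).

```python
def fits_a_b_formula(a: int, b: int, max_b: int = 1048, min_span: int = 14) -> bool:
    """Return True if the pair (a, b) satisfies the a-b formula.

    a and b are 1-based nucleotide positions; max_b is the largest value
    allowed for b, and min_span is the required minimum of b - a.
    """
    max_a = max_b - min_span   # 1034 for the bounds given for SEQ ID NO:38
    min_b = 1 + min_span       # 15
    return 1 <= a <= max_a and min_b <= b <= max_b and b >= a + min_span


def count_a_b_pairs(max_b: int = 1048, min_span: int = 14) -> int:
    """Count every (a, b) pair that satisfies the formula."""
    max_a = max_b - min_span
    # For a fixed a, b runs from a + min_span up to max_b inclusive.
    return sum(max_b - (a + min_span) + 1 for a in range(1, max_a + 1))


if __name__ == "__main__":
    print(fits_a_b_formula(1, 15))     # shortest fragment starting at position 1
    print(fits_a_b_formula(10, 20))    # span shorter than 15 nucleotides
    print(count_a_b_pairs())
```

The same check applies to the other sequences in this section by substituting the stated upper bound for b.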

FEATURES OF PROTEIN ENCODED BY GENE NO: 29 

The translation product of this gene has been shown to encode a human brain 
specific mitochondrial carrier (Genbank Accession No. gi|3851540|gb|AAD04346.1| 
(AF078544); all references available through this accession are hereby incorporated 
herein by reference) which shares sequence homology with the human body weight 
disorder associated gene C5 product which is known to be differentially expressed in 
obese compared to lean mice (See GeneSeq Accession No. R912SI). Based on the 
sequence similarity, the translation product of this gene is expected to share at least 
some biological activities with mitochondrial carrier proteins. Such activities are 
known in the art, some of which are described in Sanchis et al., J. Biol. Chem. 
273:34611-34615 (1998), incorporated herein by reference. 

Included in this invention as preferred domains are mitochondrial energy 
transfer protein (METP) domains, which were identified using the ProSite analysis 
tool (Swiss Institute of Bioinformatics). Structurally, members of the family of 
mitochondrial energy transfer proteins consist of three tandem repeats of a domain of 
approximately one hundred residues. Each of these domains contains two 
transmembrane regions. As a signature pattern, we selected one of the most conserved 
regions in the repeated domain, located just after the first transmembrane region. To 
detect this widespread family of proteins, a consensus sequence was developed that 
contains the most conserved regions in the repeated domain. The consensus pattern is 
as follows: P.[DE].[LIVAT][RK].[LRH][LIVMFY][QMAIGV]. 

Preferred polypeptides of the invention comprise the following amino acid 
sequences: PVDLTKTRLQ (SEQ ID NO: 309) and PTDVLKIRMQ (SEQ ID NO: 
310). Polynucleotides encoding these polypeptides are also provided. 
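
The consensus pattern above can be read directly as a regular expression, with "." standing for any residue and each bracketed group listing the residues allowed at that position. The following is a minimal sketch, assuming that reading of the pattern; the function name is hypothetical, and the scan reports 1-based start positions of every match, including overlapping ones.

```python
import re

# Consensus pattern from the text; the sentence-final period is not part of it.
METP_PATTERN = r"P.[DE].[LIVAT][RK].[LRH][LIVMFY][QMAIGV]"

# A lookahead wrapper lets the scan report overlapping matches as well.
_METP_SCAN = re.compile(r"(?=(" + METP_PATTERN + r"))")


def find_metp_signatures(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched peptide) for each signature hit."""
    return [(m.start() + 1, m.group(1)) for m in _METP_SCAN.finditer(sequence.upper())]


if __name__ == "__main__":
    # The two preferred peptides recited above.
    for peptide in ("PVDLTKTRLQ", "PTDVLKIRMQ"):
        print(peptide, find_metp_signatures(peptide))
```

Both recited peptides, SEQ ID NO: 309 and SEQ ID NO: 310, satisfy the pattern position by position, which is consistent with their being presented as METP signature sequences.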

Further preferred are polypeptides comprising the METP domains of the 
sequence listed above, and at least 5, 10, 15, 20, 25, 30, 50, or 75 additional 
contiguous amino acid residues of the sequence referenced in Table 1 for this gene. 
The additional contiguous amino acid residues are N-terminal or C-terminal to the 
METP domain. Alternatively, the additional contiguous amino acid residues are both 
N-terminal and C-terminal to the METP domain, wherein the total N- and C-terminal 
contiguous amino acid residues equal the specified number. The above preferred 
polypeptide domain is characteristic of a signature specific to mitochondrial energy 
transfer proteins. 
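
The ways of adding flanking residues described above can be enumerated mechanically. The following is a minimal sketch under stated assumptions: the parent sequence used in the example is hypothetical (the recited peptide PVDLTKTRLQ with made-up flanks, not the sequence of Table 1), coordinates are 0-based and end-exclusive, and the function returns fragments with exactly the specified number of extra residues rather than "at least" that number.

```python
def domain_with_flanks(sequence: str, domain_start: int, domain_end: int, extra: int) -> list[str]:
    """List fragments containing the domain plus `extra` contiguous flanking residues.

    The extra residues are split as n_term on the N-terminal side and
    extra - n_term on the C-terminal side, covering the N-only, C-only,
    and mixed cases. Splits that run past either end of the parent
    sequence are skipped.
    """
    fragments = []
    for n_term in range(extra + 1):
        c_term = extra - n_term
        start = domain_start - n_term
        end = domain_end + c_term
        if start < 0 or end > len(sequence):
            continue
        fragments.append(sequence[start:end])
    return fragments


if __name__ == "__main__":
    # Hypothetical parent sequence: five residues on each side of the domain.
    parent = "AAAAA" + "PVDLTKTRLQ" + "GGGGG"
    for fragment in domain_with_flanks(parent, 5, 15, extra=5):
        print(fragment)
```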

The gene encoding the disclosed cDNA is believed to reside on chromosome 
X. Accordingly, polynucleotides related to this invention are useful as a marker in 
linkage analysis for chromosome X. 

This gene is expressed primarily in brain, and to a lesser extent, in T-cells. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, neurological and behavioral disorders and immune disorders and/or 
obesity. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the 
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the digestive, immune, and nervous systems, expression of this gene 
at significantly higher or lower levels is routinely detected in certain tissues or cell 
types (e.g., immune, neural, or cancerous and wounded tissues) or bodily fluids (e.g., 
lymph, serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell 
sample taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 152 as residues: Gln-189 to Gly-195. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in brain and homology to mitochondrial carrier proteins 
indicates polynucleotides and polypeptides corresponding to this gene are useful for 
the detection, treatment, and/or prevention of neurodegenerative disease states, 
behavioral disorders, or inflammatory conditions. Representative uses are described in 
the "Regeneration" and "Hyperproliferative Disorders" sections below, in Example 
11, 15, and 18, and elsewhere herein. Briefly, the uses include, but are not limited to 
the detection, treatment, and/or prevention of Alzheimer's Disease, Parkinson's 
Disease, Huntington's Disease, Tourette Syndrome, meningitis, encephalitis, 
demyelinating diseases, peripheral neuropathies, neoplasia, trauma, congenital 
malformations, spinal cord injuries, ischemia and infarction, aneurysms, hemorrhages, 
schizophrenia, mania, dementia, paranoia, obsessive compulsive disorder, depression, 
panic disorder, learning disabilities, ALS, psychoses, autism, and altered behaviors, 
including disorders in feeding, sleep patterns, balance, and perception. In addition, 
elevated expression of this gene product in regions of the brain indicates it plays a 
role in normal neural function. 

Potentially, this gene product is involved in synapse formation, 
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or 
survival. Furthermore, the protein may also be used to determine biological activity, 
to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to 
identify agents that modulate their interactions, in addition to its use as a nutritional 
supplement. Protein, as well as, antibodies directed against the protein may show 
utility as a tumor marker and/or immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:39 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1416 of SEQ ID NO:39, b is an 
integer of 15 to 1430, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:39, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 30 

Preferred polypeptides of the invention comprise the following amino acid 
sequence: MTFGSTISPTSTHASPSLGFCCSWLLEDLEEQLYCSAFEEAALTR 
RICNPTSCWLPLDMELLHRQVLALQTQRVLLGMWLRRAWDTWVSPRRVAP 
GSRCLLTASHPCTEKRRKASAXQRNLGYPLAMLCLLVLTGLSVLIVAIHILEL 
LIDEAAMPRGMQGTSLGQVSFSKLGSFGAVIQVVLIFYLMVSSVVGFYSSPLF 
RSLRPRWHDTAMTQIIGNCVCLLVLSSALPVFSRTLGLTRFDLLGDFGRFNWL 
GNFYIVFLYNAAFAGLTTLCLVKTFTAAVRAELIRAFGLDRLPLPVSGFPOAS 
RKTQHQ (SEQ ID NO: 311). Polynucleotides encoding such polypeptides are also 
provided. 

This gene is expressed primarily in immune system tissues (e.g. resting T-cells, 
primary dendritic cells, neutrophils, and apoptotic T-cells) and umbilical vein. 
This gene is expressed to a lesser extent in the gastrointestinal tissue (e.g. small 
intestine, colon), brain (e.g. cerebellum, frontal cortex), aorta endothelial cells, skin 
tumor, embryonic tissue, thymus, and cancers (e.g. cheek, breast, synovial). 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, cancer and immune disorders. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the immune system and gastrointestinal tract, 
expression of this gene at significantly higher or lower levels is routinely detected in 
certain tissues or cell types (e.g., immune, gastrointestinal, cancerous and wounded 
tissues) or bodily fluids (e.g., amniotic, serum, plasma, urine, synovial fluid and 
spinal fluid) or another tissue or cell sample taken from an individual having such a 
disorder, relative to the standard gene expression level, i.e., the expression level in 
healthy tissue or bodily fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 153 as residues: Asp-21 to Ser-29. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in immune cells (e.g. T-cells, dendritic cells, 
neutrophils) indicates polynucleotides and polypeptides corresponding to this gene are 
useful for the diagnosis and treatment of a variety of immune system disorders. 
Representative uses are described in the "Immune Activity" and "infectious disease" 
sections below, in Example 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere herein. 
Briefly, the expression of this gene product indicates a role in regulating the 
proliferation; survival; differentiation; and/or activation of hematopoietic cell 
lineages, including blood stem cells. This gene product is involved in the regulation of 
cytokine production, antigen presentation, or other processes suggesting a usefulness 
in the treatment of cancer (e.g. by boosting immune responses). 

Since the gene is expressed in cells of lymphoid origin, the natural gene 
product is involved in immune functions. Therefore it is also useful as an agent for 
immunological disorders including arthritis, asthma, immunodeficiency diseases such 
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory 
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities, 
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and 
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity 
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic 
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's 
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that 
influences the differentiation or behavior of other blood cells, or that recruits 
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in 
the expansion of stem cells and committed progenitors of various blood lineages, and 
in the differentiation and/or proliferation of various cell types. The tissue distribution 
in skin tumors and cancerous tissue (e.g. cheek, breast, synovial sarcoma) indicates 
that polynucleotides and polypeptides corresponding to this gene are useful for the 
diagnosis and treatment of cancer and other proliferative disorders. Expression in 
cellular sources such as embryonic tissue marked by proliferating cells indicates that 
this protein may play a role in the regulation of cellular division. Additionally, the 
expression in hematopoietic cells and tissues indicates that this protein may play a 
role in the proliferation, differentiation, and/or survival of hematopoietic cell lineages. 
In such an event, this gene is useful in the treatment of lymphoproliferative disorders, 
and in the maintenance and differentiation of various hematopoietic lineages from 
early hematopoietic stem and committed progenitor cells. Similarly, embryonic 
development also involves decisions involving cell differentiation and/or apoptosis in 
pattern formation. Thus this protein may also be involved in apoptosis or tissue 
differentiation and could again be useful in cancer therapy. Protein, as well as, 
antibodies directed against the protein may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tissues. The tissue distribution in 
cerebellum and frontal cortex indicates polynucleotides and polypeptides 
corresponding to this gene are useful for the detection, treatment, and/or prevention of 
neurodegenerative disease states, behavioral disorders, or inflammatory conditions. 
Representative uses are described in the "Regeneration" and "Hyperproliferative 
Disorders" sections below, in Example 11, 15, and 18, and elsewhere herein. Briefly, 
the uses include, but are not limited to the detection, treatment, and/or prevention of 
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, Tourette Syndrome, 
meningitis, encephalitis, demyelinating diseases, peripheral neuropathies, neoplasia, 
trauma, congenital malformations, spinal cord injuries, ischemia and infarction, 
aneurysms, hemorrhages, schizophrenia, mania, dementia, paranoia, obsessive 
compulsive disorder, depression, panic disorder, learning disabilities, ALS, 
psychoses, autism, and altered behaviors, including disorders in feeding, sleep 
patterns, balance, and perception. In addition, elevated expression of this gene product 
in regions of the brain indicates it plays a role in normal neural function. 

Potentially, this gene product is involved in synapse formation, 
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or 
survival. Furthermore, the protein may also be used to determine biological activity, 
to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to 
identify agents that modulate their interactions, in addition to its use as a nutritional 
supplement. Protein, as well as, antibodies directed against the protein may show 
utility as a tumor marker and/or immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:40 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 2089 of SEQ ID NO:40, b is an 
integer of 15 to 2103, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:40, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 31 
The polypeptide of this gene has been determined to have a zinc finger (Zinc finger, 
C2H2 type) domain at about amino acid position 16-50 of the amino acid sequence 
referenced in Table 1 for this gene. Therefore, a preferred polypeptide fragment of 
the invention comprises the following amino acid sequence: 
LCVCLVYLCMYGVCLCVIVCVSGVSLCLYVWGVSVC 
DCVSVFMCVCLCVIFCVYGKPRTEHYHSPHLAKQKAFREMCGRHDVSAAGIF 
QSYV (SEQ ID NO: 312). Polynucleotides encoding these polypeptides are also 
provided. Zinc finger domains are nucleic acid-binding protein structures first 
identified in the Xenopus transcription factor TFIIIA. These domains have since been 
found in numerous nucleic acid-binding proteins. A zinc finger domain is composed 
of 25 to 30 amino-acid residues. There are two cysteine or histidine residues at both 
extremities of the domain, which are involved in the tetrahedral coordination of a zinc 
atom. It has been proposed that such a domain interacts with about five nucleotides. A 
schematic representation of a zinc finger domain is shown below: 

Many classes of zinc fingers are characterized according to the number and 
positions of the histidine and cysteine residues involved in the zinc atom coordination. 
In the first class to be characterized, called C2H2, the first pair of zinc coordinating 
residues are cysteines, while the second pair are histidines. A number of experimental 
reports have demonstrated the zinc-dependent DNA or RNA binding property of 
some members of this class. Some of the proteins known to include C2H2-type zinc 
fingers are listed below. We have indicated, between brackets, the number of zinc 
finger regions found in each of these proteins; a symbol indicates that only partial 
sequence data is available and that additional finger domains are present. 
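
The arrangement described above, a pair of cysteines followed by a pair of histidines, can be screened for with a regular expression. The following is a rough sketch only: the spacing used here, C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H, is an assumption modeled on the commonly used PROSITE-style C2H2 signature and is not recited in this document, and the function name is hypothetical.

```python
import re

# Assumed C2H2 spacing (not taken from this document):
# C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H
C2H2_PATTERN = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")


def has_c2h2_signature(sequence: str) -> bool:
    """Return True if the sequence contains at least one assumed C2H2 signature."""
    return C2H2_PATTERN.search(sequence.upper()) is not None


if __name__ == "__main__":
    # Synthetic test peptide built to fit the assumed spacing exactly.
    synthetic = "C" + "AA" + "C" + "AAA" + "F" + "A" * 8 + "H" + "AAA" + "H"
    print(has_c2h2_signature(synthetic))
    # Same peptide with both histidines replaced, which should no longer fit.
    print(has_c2h2_signature(synthetic.replace("H", "A")))
```

A hit from such a screen only flags a candidate region; it does not by itself establish zinc coordination or nucleic acid binding.
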
This gene is expressed primarily in salivary gland. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, salivary gland related diseases, diseases of the mouth, and other 
digestive disorders. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the digestive system, expression of this gene at 
significantly higher or lower levels is routinely detected in certain tissues or cell types 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., saliva, serum, plasma, 
urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 154 as residues: Gly-46 to His-54. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution indicates that the protein products of this gene are 
useful for diagnosis and treatment of salivary gland related diseases (mumps, calculi 
formation in ducts, sarcoidosis, facial palsy, tumors, Sjogrens Syndrome) and other 
digestive system disorders. Furthermore, the protein may also be used to determine 
biological activity, to raise antibodies, as tissue markers, to isolate cognate ligands or 
receptors, to identify agents that modulate their interactions, in addition to its use as a 
nutritional supplement. Protein, as well as, antibodies directed against the protein may 
show utility as a tumor marker and/or immunotherapy targets for the above listed 
tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:41 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 2335 of SEQ ID NO:41, b is an 
integer of 15 to 2349, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:41, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 32 

This gene is expressed primarily in fetal tissue (e.g. spleen, liver, brain), 
cancerous tissues (e.g. ovarian, colon, stomach, parathyroid) and to a lesser extent in 
immune cells and tissue (e.g. B-cells, T-cells, bone marrow), and reproductive organs. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, cancer, particularly of the colon and ovaries, disorders of the 
developing fetus, neurodegenerative conditions, and immune system disorders. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
immune system, expression of this gene at significantly higher or lower levels is 
routinely detected in certain tissues or cell types (e.g., immune, reproductive, neural, 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid and spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression 
level in healthy tissue or bodily fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 155 as residues: Lys-35 to Lys-47. Polynucleotides 
encoding said polypeptides are also provided. 

The expression of this gene within fetal tissue and other cellular sources marked by 
proliferating cells indicates this protein may play a role in the regulation of cellular 
division, and may show utility in the diagnosis, treatment, and/or prevention of 
developmental diseases and disorders, including cancer, and other proliferative 
conditions. Representative uses are described in the "Hyperproliferative Disorders" 
and "Regeneration" sections below and elsewhere herein. Briefly, developmental 
tissues rely on decisions involving cell differentiation and/or apoptosis in pattern 
formation. 

Dysregulation of apoptosis can result in inappropriate suppression of cell 
death, as occurs in the development of some cancers, or in failure to control the extent 
of cell death, as is believed to occur in acquired immunodeficiency and certain 
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of 
potential roles in proliferation and differentiation, this gene product may have 
applications in the adult for tissue regeneration and the treatment of cancers. It may 
also act as a morphogen to control cell and tissue type specification. Therefore, the 
polynucleotides and polypeptides of the present invention are useful in treating, 
detecting, and/or preventing said disorders and conditions, in addition to other types 
of degenerative conditions. Thus this protein may modulate apoptosis or tissue 
differentiation and is useful in the detection, treatment, and/or prevention of 
degenerative or proliferative conditions and diseases. The protein is useful in 
modulating the immune response to aberrant polypeptides, as may exist in 
proliferating and cancerous cells and tissues. The protein can also be used to gain new 
insight into the regulation of cellular growth and proliferation. The tissue distribution 
in immune cells (such as T-cells and B-cells) and immune tissues (bone marrow) 
indicates polynucleotides and polypeptides corresponding to this gene are useful for 
the diagnosis and treatment of a variety of immune system disorders. Representative 
uses are described in the "Immune Activity" and "infectious disease" sections below, 
in Example 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the 
expression of this gene product indicates a role in regulating the proliferation; 
survival; differentiation; and/or activation of hematopoietic cell lineages, including 
blood stem cells. This gene product is involved in the regulation of cytokine 
production, antigen presentation, or other processes suggesting a usefulness in the 
treatment of cancer (e.g. by boosting immune responses). 

Since the gene is expressed in cells of lymphoid origin, the natural gene 
product is involved in immune functions. Therefore it is also useful as an agent for 
immunological disorders including arthritis, asthma, immunodeficiency diseases such 
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory 
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities, 
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and 
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity 
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic 
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's 
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that 
influences the differentiation or behavior of other blood cells, or that recruits 
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in 
the expansion of stem cells and committed progenitors of various blood lineages, and 
in the differentiation and/or proliferation of various cell types. The tissue distribution 
in parathyroid indicates polynucleotides and polypeptides corresponding to this gene 
are useful for the detection, treatment, and/or prevention of various endocrine 
disorders and cancers. Representative uses are described in the "Biological Activity", 
"Hyperproliferative Disorders", and "Binding Activity" sections below, in Example 
11, 17, 18, 19, 20 and 27, and elsewhere herein. Briefly, the protein can be used for 
the detection, treatment, and/or prevention of Addison's Disease, Cushing's 
Syndrome, and disorders and/or cancers of the pancreas (e.g. diabetes mellitus), 
adrenal cortex, ovaries, pituitary (e.g., hyper-, hypopituitarism), thyroid (e.g. hyper-, 
hypothyroidism), parathyroid (e.g. hyper-, hypoparathyroidism), hypothalamus, and 
testes. Additionally, the tissue distribution in brain tissue indicates polynucleotides 
and polypeptides corresponding to this gene are useful for the detection, treatment, 
and/or prevention of neurodegenerative disease states, behavioral disorders, or 
inflammatory conditions. Representative uses are described in the "Regeneration" and 
"Hyperproliferative Disorders" sections below, in Example 11, 15, and 18, and 
elsewhere herein. Briefly, the uses include, but are not limited to the detection, 
treatment, and/or prevention of Alzheimer's Disease, Parkinson's Disease, 
Huntington's Disease, Tourette Syndrome, meningitis, encephalitis, demyelinating 
diseases, peripheral neuropathies, neoplasia, trauma, congenital malformations, spinal 
cord injuries, ischemia and infarction, aneurysms, hemorrhages, schizophrenia, 
mania, dementia, paranoia, obsessive compulsive disorder, depression, panic disorder, 
learning disabilities, ALS, psychoses, autism, and altered behaviors, including 
disorders in feeding, sleep patterns, balance, and perception. In addition, elevated 
expression of this gene product in regions of the brain indicates it plays a role in 
normal neural function. 

Potentially, this gene product is involved in synapse formation, 
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or 
survival. Furthermore, the protein may also be used to determine biological activity, 
to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to 
identify agents that modulate their interactions, in addition to its use as a nutritional 
supplement. Protein, as well as, antibodies directed against the protein may show 
utility as a tumor marker and/or immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:42 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1545 of SEQ ID NO:42, b is an 
integer of 15 to 1559, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:42, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 33 

When tested against U937 Myeloid cell lines, supernatants removed from cells 
containing this gene activated the GAS assay. Thus, it is likely that this gene activates 
myeloid cells through the Jak-STAT signal transduction pathway. The gamma 
activating sequence (GAS) is a promoter element found upstream of many genes 
which are involved in the Jak-STAT pathway. The Jak-STAT pathway is a large 
signal transduction pathway involved in the differentiation and proliferation of cells. 
Therefore, activation of the Jak-STAT pathway, reflected by the binding of the GAS 
element, can be used to indicate proteins involved in the proliferation and 
differentiation of cells. 

This gene is expressed primarily in skin tumors. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, skin disorders, particularly skin cancer. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the skin, expression of this gene 
at significantly higher or lower levels is routinely detected in certain tissues or cell 
types (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 
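
The comparison against a standard expression level described above can be illustrated with a minimal sketch. The function name, the two-fold cutoff, and the use of a log2 ratio are assumptions made only for illustration; the specification does not state how "significantly higher or lower" is to be determined.

```python
import math


def classify_expression(sample_level, standard_level, min_fold=2.0):
    """Compare a sample's expression level with the healthy standard level.

    min_fold is a hypothetical cutoff (not taken from the specification):
    a change of at least this factor in either direction is reported.
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    # The log2 ratio treats an increase and a decrease symmetrically.
    log2_fold_change = math.log2(sample_level / standard_level)
    cutoff = math.log2(min_fold)
    if log2_fold_change >= cutoff:
        return "higher"
    if log2_fold_change <= -cutoff:
        return "lower"
    return "unchanged"
```

In practice the standard level would be derived from healthy tissue or bodily fluid measured in the same assay, and a statistical test over replicates would normally replace the fixed cutoff used here.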

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 156 as residues: Pro-38 to Gly-44, Phe-56 to Thr-64.
Polynucleotides encoding said polypeptides are also provided.

The tissue distribution in skin indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment, diagnosis, and/or prevention
of various skin disorders including congenital disorders (i.e. nevi, moles, freckles,
Mongolian spots, hemangiomas, port-wine syndrome), integumentary tumors (i.e.
keratoses, Bowen's Disease, basal cell carcinoma, squamous cell carcinoma,
malignant melanoma, Paget's Disease, mycosis fungoides, and Kaposi's sarcoma),
injuries and inflammation of the skin (i.e. wounds, rashes, prickly heat disorder,
psoriasis, dermatitis), atherosclerosis, urticaria, eczema, photosensitivity, autoimmune
disorders (i.e. lupus erythematosus, vitiligo, dermatomyositis, morphea, scleroderma,
pemphigoid, and pemphigus), keloids, striae, erythema, petechiae, purpura, and
xanthelasma. Moreover, such disorders may predispose increased susceptibility to
viral and bacterial infections of the skin (i.e. cold sores, warts, chickenpox,
molluscum contagiosum, herpes zoster, boils, cellulitis, erysipelas, impetigo, tinea,
athlete's foot, and ringworm). Protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and immunotherapy targets for the above
listed tumors and tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:43 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1752 of SEQ ID NO:43, b is an
integer of 15 to 1766, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:43, and where b is greater than or equal to a + 14.
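
The general formula a-b can be read as a simple test on a pair of nucleotide positions. The following is a minimal sketch using the SEQ ID NO:43 bounds quoted above; the function name and the assumption that all stated ranges are inclusive are illustrative and not taken from the specification.

```python
def fits_formula_a_b(a, b, seq_len=1766, min_len=15):
    """Return True if positions a..b (1-based) satisfy the formula a-b.

    Defaults follow the figures given for SEQ ID NO:43: a from 1 to 1752,
    b from 15 to 1766, and b >= a + 14 (a fragment of at least 15 residues).
    Ranges are assumed to be inclusive.
    """
    max_a = seq_len - (min_len - 1)  # 1766 - 14 = 1752
    return (
        1 <= a <= max_a
        and min_len <= b <= seq_len
        and b >= a + (min_len - 1)
    )
```

For example, the pair (1, 15) satisfies the formula, whereas (10, 20) does not, because b would have to be at least 24.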

FEATURES OF PROTEIN ENCODED BY GENE NO: 34 

The translation product of this gene shares sequence homology with
mitogen-induced prostate carcinoma (mouse) which is thought to be important in the
etiology of cancer. In this respect, this gene is mitogen-induced and/or involved in
cell proliferation.

Preferred polypeptides of the invention comprise the following amino acid
sequence: GHMPYGWLTEIRAVYPAFDKNNPSNKLVSTSNTVTAAHIKKF 
TFVCMALSLTLCFVMFWTPNVSEKILIDIIGVDRAPAELCVVPLRIFSFFPVPVT 
VRAHLTGWLMTLKKTFVLAPSSVLRIIVLIASLVVLPYLGVHGATLGVGSLLA 
GFVGESTMVAlAACYVYRKQKKKMENESATEGEDSAMTDMPPTEEVtDIVE 
MREENE (SEQ ID NO: 313) and/or QVVFVAILLHSHLECREPLLIPELSLYMGA 
LVRCTTLCLGYYKNIHDIIPDRSGPELGGDATIRKMLSFWWPLALILATQRISR 
PIVNLFVSRDLGGSSAATEAVAILTATYPV (SEQ ID NO: 314). Polynucleotides 
encoding these polypeptides are also provided. 

The gene encoding the disclosed cDNA is believed to reside on chromosome 
5. Accordingly, polynucleotides related to this invention are useful as a marker in 
linkage analysis for chromosome 5. 

This gene is expressed primarily in early infant and adult brain, retina, fetal
tissue (e.g., liver, spleen, whole embryo) and to a lesser extent in immune cells (e.g.,
monocytes and T-cells), colon, and parathyroid tumor tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancers, disorders of the immune system and nervous system.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
metabolic system (cancers), expression of this gene at significantly higher or lower
levels is routinely detected in certain tissues or cell types (e.g., immune, neural,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid and spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression
level in healthy tissue or bodily fluid from an individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 157 as residues: Arg-122 to Ser-139, Met-144 to
Glu-149. Polynucleotides encoding said polypeptides are also provided.

The tissue distribution and homology to mitogen induced prostate carcinoma
(mouse) indicates that polynucleotides and polypeptides corresponding to this gene
are useful for the study and treatment of cancers, including but not limited to the
colon, parathyroid, and adrenal glands. Moreover, the expression within fetal tissue
and other cellular sources marked by proliferating cells indicates this protein may
play a role in the regulation of cellular division, and may show utility in the diagnosis,
treatment, and/or prevention of developmental diseases and disorders, including
cancer, and other proliferative conditions. Representative uses are described in the
"Hyperproliferative Disorders" and "Regeneration" sections below and elsewhere
herein. Briefly, developmental tissues rely on decisions involving cell differentiation
and/or apoptosis in pattern formation.

Dysregulation of apoptosis can result in inappropriate suppression of cell 
death, as occurs in the development of some cancers, or in failure to control the extent 
of cell death, as is believed to occur in acquired immunodeficiency and certain 
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of 
potential roles in proliferation and differentiation, this gene product may have 
applications in the adult for tissue regeneration and the treatment of cancers. It may 
also act as a morphogen to control cell and tissue type specification. Therefore, the 
polynucleotides and polypeptides of the present invention are useful in treating, 
detecting, and/or preventing said disorders and conditions, in addition to other types
of degenerative conditions. Thus this protein may modulate apoptosis or tissue 
differentiation and is useful in the detection, treatment, and/or prevention of 
degenerative or proliferative conditions and diseases. The protein is useful in 
modulating the immune response to aberrant polypeptides, as may exist in 
proliferating and cancerous cells and tissues. The protein can also be used to gain new 
insight into the regulation of cellular growth and proliferation. The tissue distribution 
in immune cells (T-cells, monocytes) indicates polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of a variety of
immune system disorders. Representative uses are described in the "Immune
Activity" and "infectious disease" sections below, in Example 11, 13, 14, 16, 18, 19,
20, and 27, and elsewhere herein. Briefly, the expression of this gene product
indicates a role in regulating the proliferation; survival; differentiation; and/or
activation of hematopoietic cell lineages, including blood stem cells. This gene
product is involved in the regulation of cytokine production, antigen presentation, or
other processes suggesting a usefulness in the treatment of cancer (e.g. by boosting
immune responses).

Since the gene is expressed in cells of lymphoid origin, the natural gene
product is involved in immune functions. Therefore it is also useful as an agent for
immunological disorders including arthritis, asthma, immunodeficiency diseases such
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that
influences the differentiation or behavior of other blood cells, or that recruits
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in
the expansion of stem cells and committed progenitors of various blood lineages, and
in the differentiation and/or proliferation of various cell types. The tissue distribution
in brain indicates polynucleotides and polypeptides corresponding to this gene are
useful for the detection, treatment, and/or prevention of neurodegenerative disease
states, behavioral disorders, or inflammatory conditions. Representative uses are
described in the "Regeneration" and "Hyperproliferative Disorders" sections below, in
Example 11, 15, and 18, and elsewhere herein. Briefly, the uses include, but are not
limited to the detection, treatment, and/or prevention of Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, Tourette Syndrome, meningitis,
encephalitis, demyelinating diseases, peripheral neuropathies, neoplasia, trauma,
congenital malformations, spinal cord injuries, ischemia and infarction, aneurysms,
hemorrhages, schizophrenia, mania, dementia, paranoia, obsessive compulsive
disorder, depression, panic disorder, learning disabilities, ALS, psychoses, autism,
and altered behaviors, including disorders in feeding, sleep patterns, balance, and
perception. In addition, elevated expression of this gene product in regions of the
brain indicates it plays a role in normal neural function.



Potentially, this gene product is involved in synapse formation,
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or
survival. Furthermore, the protein may also be used to determine biological activity,
to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to
identify agents that modulate their interactions, in addition to its use as a nutritional
supplement. Protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:44 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 2558 of SEQ ID NO:44, b is an
integer of 15 to 2572, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:44, and where b is greater than or equal to a + 14. 
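
The remark that listing every related sequence is cumbersome can be made concrete by counting the position pairs that the formula a-b admits. The sketch below assumes inclusive ranges and uses hypothetical function names; with k = L - 14 permitted start positions for a sequence of length L, the number of pairs is k(k + 1)/2.

```python
def count_a_b_pairs(seq_len, min_len=15):
    """Closed-form count of (a, b) pairs with 1 <= a, b <= seq_len, b >= a + 14."""
    k = seq_len - (min_len - 1)  # number of permitted values of a
    if k < 1:
        return 0
    return k * (k + 1) // 2


def count_a_b_pairs_brute(seq_len, min_len=15):
    """Direct enumeration, used only to cross-check the closed form."""
    return sum(
        1
        for a in range(1, seq_len + 1)
        for b in range(a + min_len - 1, seq_len + 1)
    )


# For the 2572-residue sequence of SEQ ID NO:44, k = 2558.
print(count_a_b_pairs(2572))  # 3272961
```

Under these assumptions the formula covers more than three million distinct fragments of SEQ ID NO:44 alone, which is why the exclusion is stated by formula rather than by enumeration.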

FEATURES OF PROTEIN ENCODED BY GENE NO: 35 

This gene is expressed primarily in adult pulmonary tissue, umbilical vein,
prostate, and fetal tissue (e.g., heart).

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, diseases of the pulmonary system. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the pulmonary system,
expression of this gene at significantly higher or lower levels is routinely detected in
certain tissues or cell types (e.g., pulmonary, cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 158 as residues: Arg-45 to Gly-51, Glu-75 to Asn-81. 
Polynucleotides encoding said polypeptides are also provided.
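
Epitope ranges written in the form "Arg-45 to Gly-51" lend themselves to simple parsing. The following sketch, with a hypothetical function name and regular expression, extracts each range and its length in residues; it does not check the named residues against SEQ ID NO: 158, which is not reproduced in this section.

```python
import re

# Three-letter residue code, hyphen, position, "to", then the same again.
EPITOPE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def parse_epitopes(text):
    """Return (first residue, start, last residue, end, length) per range."""
    spans = []
    for first_res, start, last_res, end in EPITOPE_RE.findall(text):
        start, end = int(start), int(end)
        if end < start:
            raise ValueError(f"range {start}-{end} is reversed")
        spans.append((first_res, start, last_res, end, end - start + 1))
    return spans


print(parse_epitopes("Arg-45 to Gly-51, Glu-75 to Asn-81"))
# [('Arg', 45, 'Gly', 51, 7), ('Glu', 75, 'Asn', 81, 7)]
```

Both ranges listed for SEQ ID NO: 158 are seven residues long under this reading.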

The tissue distribution in pulmonary tissue indicates that polynucleotides and
polypeptides corresponding to this gene are useful for the detection and treatment of
disorders associated with developing lungs, particularly in premature infants where
the lungs are the last tissues to develop. Additionally, the tissue distribution indicates
that polynucleotides and polypeptides corresponding to this gene are useful for the
diagnosis and intervention of lung tumors, since the gene is involved in the regulation
of cell division, particularly since it is expressed in fetal tissue. Moreover, the
expression within fetal tissue and other cellular sources marked by proliferating cells
indicates this protein may play a role in the regulation of cellular division, and may
show utility in the diagnosis, treatment, and/or prevention of developmental diseases
and disorders, including cancer and other proliferative conditions. Representative
uses are described in the "Hyperproliferative Disorders" and "Regeneration" sections
below and elsewhere herein. Briefly, developmental tissues rely on decisions
involving cell differentiation and/or apoptosis in pattern formation.

Dysregulation of apoptosis can result in inappropriate suppression of cell
death, as occurs in the development of some cancers, or in failure to control the extent
of cell death, as is believed to occur in acquired immunodeficiency and certain
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of
potential roles in proliferation and differentiation, this gene product may have
applications in the adult for tissue regeneration and the treatment of cancers. It may
also act as a morphogen to control cell and tissue type specification. Therefore, the
polynucleotides and polypeptides of the present invention are useful in treating,
detecting, and/or preventing said disorders and conditions, in addition to other types
of degenerative conditions. Thus this protein may modulate apoptosis or tissue
differentiation and is useful in the detection, treatment, and/or prevention of
degenerative or proliferative conditions and diseases. The protein is useful in
modulating the immune response to aberrant polypeptides, as may exist in
proliferating and cancerous cells and tissues. The protein can also be used to gain new
insight into the regulation of cellular growth and proliferation. Furthermore, the
protein may also be used to determine biological activity, to raise antibodies, as tissue
markers, to isolate cognate ligands or receptors, to identify agents that modulate their
interactions, in addition to its use as a nutritional supplement. Protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:45 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 512 of SEQ ID NO:45, b is an
integer of 15 to 526, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:45, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 36 

This gene is expressed primarily in adipose tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, disorders of fat metabolism. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the metabolic system, expression of this gene
at significantly higher or lower levels is routinely detected in certain tissues or cell
types (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 159 as residues: Pro-96 to Ser-106. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in adipose tissue indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for the treatment of obesity and 
other metabolic and endocrine conditions or disorders. Furthermore, the protein 
product of this gene may show utility in ameliorating conditions which occur 
secondary to aberrant fatty-acid metabolism (e.g. aberrant myelin sheath 
development), either directly or indirectly. Protein, as well as, antibodies directed 
against the protein may show utility as a tumor marker and/or immunotherapy targets 
for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:46 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1018 of SEQ ID NO:46, b is an 
integer of 15 to 1032, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:46, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 37 

This gene is expressed primarily in adult brain tissue, testes, placenta, kidney,
infant and fetal tissue (e.g., liver, spleen, lung) and to a lesser extent in immune cells
(e.g., T-cells and neutrophils) and in cancerous tissues (e.g., ovarian tumor, Hodgkin's
lymphoma, pancreas, T-cell).

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, CNS disorders, disorders of the testicles, cancer, particularly ovarian,
pancreatic, T-cell, and Hodgkin's lymphoma. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the brain, CNS, and testes, expression of this
gene at significantly higher or lower levels is routinely detected in certain tissues or
cell types (e.g., neural, urogenital, cancerous and wounded tissues) or bodily fluids
(e.g., serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell
sample taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution in brain indicates polynucleotides and polypeptides
corresponding to this gene are useful for the detection, treatment, and/or prevention of
neurodegenerative disease states, behavioral disorders, or inflammatory conditions.
Representative uses are described in the "Regeneration" and "Hyperproliferative
Disorders" sections below, in Example 11, 15, and 18, and elsewhere herein. Briefly,
the uses include, but are not limited to the detection, treatment, and/or prevention of
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, Tourette Syndrome,
meningitis, encephalitis, demyelinating diseases, peripheral neuropathies, neoplasia,
trauma, congenital malformations, spinal cord injuries, ischemia and infarction,
aneurysms, hemorrhages, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder, depression, panic disorder, learning disabilities, ALS,
psychoses, autism, and altered behaviors, including disorders in feeding, sleep
patterns, balance, and perception. In addition, elevated expression of this gene product
in regions of the brain indicates it plays a role in normal neural function.

Potentially, this gene product is involved in synapse formation,
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or
survival. Moreover, the expression within fetal tissue and other cellular sources
marked by proliferating cells indicates this protein may play a role in the regulation of
cellular division, and may show utility in the diagnosis, treatment, and/or prevention
of developmental diseases and disorders, including cancer, and other proliferative
conditions. Representative uses are described in the "Hyperproliferative Disorders"
and "Regeneration" sections below and elsewhere herein. Briefly, developmental
tissues rely on decisions involving cell differentiation and/or apoptosis in pattern
formation.

Dysregulation of apoptosis can result in inappropriate suppression of cell
death, as occurs in the development of some cancers, or in failure to control the extent
of cell death, as is believed to occur in acquired immunodeficiency and certain
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of
potential roles in proliferation and differentiation, this gene product may have
applications in the adult for tissue regeneration and the treatment of cancers. It may
also act as a morphogen to control cell and tissue type specification. Therefore, the
polynucleotides and polypeptides of the present invention are useful in treating,
detecting, and/or preventing said disorders and conditions, in addition to other types
of degenerative conditions. Thus this protein may modulate apoptosis or tissue
differentiation and is useful in the detection, treatment, and/or prevention of
degenerative or proliferative conditions and diseases. The protein is useful in
modulating the immune response to aberrant polypeptides, as may exist in
proliferating and cancerous cells and tissues. The protein can also be used to gain new
insight into the regulation of cellular growth and proliferation. The tissue distribution
indicates polynucleotides and polypeptides corresponding to this gene are useful for
the diagnosis and treatment of a variety of immune system disorders. Representative
uses are described in the "Immune Activity" and "infectious disease" sections below,
in Example 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the
expression of this gene product indicates a role in regulating the proliferation;
survival; differentiation; and/or activation of hematopoietic cell lineages, including
blood stem cells. This gene product is involved in the regulation of cytokine
production, antigen presentation, or other processes suggesting a usefulness in the
treatment of cancer (e.g. by boosting immune responses).

Since the gene is expressed in cells of lymphoid origin, the natural gene
product is involved in immune functions. Therefore it is also useful as an agent for
immunological disorders including arthritis, asthma, immunodeficiency diseases such
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that
influences the differentiation or behavior of other blood cells, or that recruits
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in
the expansion of stem cells and committed progenitors of various blood lineages, and
in the differentiation and/or proliferation of various cell types. Additionally, the
tissue distribution in testes indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and diagnosis of conditions
concerning proper testicular function (e.g. endocrine function, sperm maturation), as
well as cancer. Therefore, this gene product is useful in the treatment of male
infertility and/or impotence. This gene product is also useful in assays designed to
identify binding agents, as such agents (antagonists) are useful as male contraceptive
agents. Similarly, the protein is believed to be useful in the treatment and/or diagnosis
of testicular cancer. The testes are also a site of active gene expression of transcripts
that are expressed, particularly at low levels, in other tissues of the body. Therefore,
this gene product is expressed in other specific tissues or organs where it may play
related functional roles in other processes, such as hematopoiesis, inflammation, bone
formation, and kidney function, to name a few possible target indications.
Furthermore, the protein may also be used to determine biological activity, to raise
antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents
that modulate their interactions, in addition to its use as a nutritional supplement.
Protein, as well as antibodies directed against the protein, may show utility as a tumor
marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:47 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 2666 of SEQ ID NO:47, b is an
integer of 15 to 2680, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:47, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 38 

When tested against fibroblast cell lines, supernatants removed from cells
containing this gene activated the EGR1 assay. Thus, it is likely that this gene
activates fibroblast cells through a signal transduction pathway. Early growth
response 1 (EGR1) is a promoter associated with certain genes that induces various
tissues and cell types upon activation, leading the cells to undergo differentiation and
proliferation.

This gene is expressed primarily in endometrial stromal cells, endometrial
tumors, keratinocytes, fetal tissue (e.g., liver, spleen) and to a lesser extent in
endothelial cells and immune cells (e.g., T-cells).

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, endometrial carcinoma and immune cell disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the female
reproductive system, expression of this gene at significantly higher or lower levels is
routinely detected in certain tissues or cell types (e.g., immune, cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy
tissue or bodily fluid from an individual not having the disorder.

The tissue distribution in the endometrium indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for treating female infertility. The 
protein product is likely involved in preparation of the endometrium for implantation 
and could be administered either topically or orally. Alternatively, this gene could be 
transfected in gene-replacement treatments into the cells of the endometrium and the 
protein products could be produced. Similarly, these treatments could be performed 
during artificial insemination for the purpose of increasing the likelihood of 
implantation and development of a healthy embryo. In both cases this gene or its gene 
product could be administered at later stages of pregnancy to promote healthy 
development of the endometrium. Additionally, polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of endometrial 
carcinoma. The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of cancer and 
other proliferative disorders. Expression within embryonic tissue and other cellular 
sources marked by proliferating cells indicates that this protein may play a role in the 
regulation of cellular division. Additionally, the expression in hematopoietic cells and 
tissues indicates that this protein may play a role in the proliferation, differentiation, 
and/or survival of hematopoietic cell lineages. In such an event, this gene is useful in 
the treatment of lymphoproliferative disorders, and in the maintenance and 
differentiation of various hematopoietic lineages from early hematopoietic stem and 
committed progenitor cells. Similarly, embryonic development also involves 
decisions involving cell differentiation and/or apoptosis in pattern formation. Thus 
this protein may also be involved in apoptosis or tissue differentiation and could again 
be useful in cancer therapy. The tissue distribution in immune cells such as helper 
T-cells indicates polynucleotides and polypeptides corresponding to this gene are useful 
for the diagnosis and treatment of a variety of immune system disorders. 
Representative uses are described in the "Immune Activity" and "infectious disease" 
sections below, in Example 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere herein. 

Briefly, the expression of this gene product indicates a role in regulating the 
proliferation, survival, differentiation, and/or activation of hematopoietic cell 
lineages, including blood stem cells. This gene product is involved in the regulation of 
cytokine production, antigen presentation, or other processes suggesting a usefulness 
in the treatment of cancer (e.g., by boosting immune responses). 

Since the gene is expressed in cells of lymphoid origin, the natural gene 
product is involved in immune functions. Therefore it is also useful as an agent for 
immunological disorders including arthritis, asthma, immunodeficiency diseases such 
as AIDS, leukemia, rheumatoid arthritis, granulomatous Disease, inflammatory 
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities, 
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and 
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity 
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic 
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's 
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that 
influences the differentiation or behavior of other blood cells, or that recruits 
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in 
the expansion of stem cells and committed progenitors of various blood lineages, and 
in the differentiation and/or proliferation of various cell types. The tissue distribution 
in keratinocytes indicates polynucleotides and polypeptides corresponding to this 
gene are useful for the treatment, diagnosis, and/or prevention of various skin 
disorders. Representative uses are described in the "Biological Activity", 
"Hyperproliferative Disorders", "infectious disease", and "Regeneration" sections 
below, in Example 11, 19, and 20, and elsewhere herein. Briefly, the protein is useful 
in detecting, treating, and/or preventing congenital disorders (i.e. nevi, moles, 
freckles, Mongolian spots, hemangiomas, port-wine syndrome), integumentary 
tumors (i.e. keratoses, Bowen's Disease, basal cell carcinoma, squamous cell 
carcinoma, malignant melanoma, Paget's Disease, mycosis fungoides, and Kaposi's 
sarcoma), injuries and inflammation of the skin (i.e. wounds, rashes, prickly heat 
disorder, psoriasis, dermatitis), atherosclerosis, urticaria, eczema, photosensitivity, 
autoimmune disorders (i.e. lupus erythematosus, vitiligo, dermatomyositis, morphea, 
scleroderma, pemphigoid, and pemphigus), keloids, striae, erythema, petechiae, 
purpura, and xanthelasma. In addition, such disorders may predispose to increased 
susceptibility to viral and bacterial infections of the skin (i.e. cold sores, warts, 
chickenpox, molluscum contagiosum, herpes zoster, boils, cellulitis, erysipelas, 
impetigo, tinea, athlete's foot, and ringworm). Moreover, the protein product of this 
gene may also be useful for the treatment or diagnosis of various connective tissue 
disorders (i.e., arthritis, trauma, tendonitis, chondromalacia and inflammation, etc.), 
autoimmune disorders (i.e., rheumatoid arthritis, lupus, scleroderma, 
dermatomyositis, etc.), dwarfism, spinal deformation, joint abnormalities, and 
chondrodysplasias (i.e. spondyloepiphyseal dysplasia congenita, familial 
osteoarthritis, Atelosteogenesis type II, metaphyseal chondrodysplasia type Schmid). 
Furthermore, the protein may also be used to determine biological activity, to raise 
antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents 
that modulate their interactions, in addition to its use as a nutritional supplement. 
The protein, as well as antibodies directed against the protein, may show utility as a 
tumor marker and/or immunotherapy target for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:48 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1716 of SEQ ID NO:48, b is an 
integer of 15 to 1730, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:48, and where b is greater than or equal to a + 14. 
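
The a-b formula recited above can be illustrated with a short Python sketch. The function names are hypothetical, the bounds are assumed to be inclusive, and the sequence length (1730 for SEQ ID NO:48) is inferred from the stated upper limit of b. Because b must be at least a + 14, every span a-b covers at least 15 nucleotide positions.

```python
def is_excluded_fragment(a: int, b: int, seq_len: int) -> bool:
    """Check whether positions a..b satisfy the a-b formula.

    Assumes inclusive bounds: 1 <= a <= seq_len - 14, 15 <= b <= seq_len,
    and b >= a + 14 (a span of at least 15 nucleotides).
    """
    return 1 <= a <= seq_len - 14 and 15 <= b <= seq_len and b >= a + 14


def count_excluded_fragments(seq_len: int) -> int:
    """Count every (a, b) pair satisfying the formula by direct enumeration."""
    return sum(
        1
        for a in range(1, seq_len - 13)        # a = 1 .. seq_len - 14
        for b in range(a + 14, seq_len + 1)    # b = a + 14 .. seq_len
    )
```

Under these assumptions, calling `count_excluded_fragments(1730)` enumerates the spans covered by the formula for SEQ ID NO:48; the same functions apply to the other sequences by passing their respective upper limit of b as `seq_len`.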

FEATURES OF PROTEIN ENCODED BY GENE NO: 39 

This gene is expressed primarily in LNCaP cells (prostate cell line) and 
retina-derived N2b5HR cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, prostate cancer and eye disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the male urogenital and 
reproductive system, expression of this gene at significantly higher or lower levels is 
routinely detected in certain tissues or cell types (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative 
to the standard gene expression level, i.e., the expression level in healthy tissue or 
bodily fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 162 as residues: Asn-50 to Ser-57. Polynucleotides 
encoding said polypeptides are also provided. 

The expression in prostate may indicate the gene or its products can be used in 
disorders of the prostate, including inflammatory disorders, such as chronic 
prostatitis, granulomatous prostatitis and malacoplakia, prostatic hyperplasia and 
prostate neoplastic disorders, including adenocarcinoma, transitional cell carcinomas, 
ductal carcinomas, squamous cell carcinomas, or as hormones or factors with 
systemic or reproductive functions. The tissue distribution in retina indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for the 
treatment and/or detection of eye disorders including blindness, color blindness, 
impaired vision, short and long sightedness, retinitis pigmentosa, retinitis proliferans, 
retinoblastoma, retinochoroiditis, retinopathy, and retinoschisis. Furthermore, the 
protein may also be used to determine biological activity, to raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. The protein, as well as 
antibodies directed against the protein, may show utility as a tumor marker and/or 
immunotherapy target for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:49 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1261 of SEQ ID NO:49, b is an 
integer of 15 to 1275, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:49, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 40 

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: RCCCRGCSCRARLCPPARSTAVAPECRGAHPSR 
AMRPGTALQAVLLAVLLVGLRAATGRLLSGQPVCRGGTQRPCYKVIYFHD 
TSRRLNFEEAKEACRRGWRPASQHRVLKMNRN (SEQ ID NO: 315). 
Polynucleotides encoding these polypeptides are also provided. 

A preferred polypeptide fragment of the invention comprises the following 
amino acid sequence: MRPGTALQAVLLAVLLVGLRAATGRLLSGQPVCRGG 
TQRPCYKVIYFHDTSRRLNFEEAKEACRRGWRPASQHRVLKMNRN (SEQ ID 
NO: 316). Polynucleotides encoding these polypeptides are also provided. 
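
As a consistency check on the two sequences recited above, the fragment of SEQ ID NO: 316 appears to be SEQ ID NO: 315 read from its first methionine onward. The following is a minimal Python sketch of that comparison; the helper name is hypothetical, and the sequences are transcribed from the text above with line-wrap spaces removed, so any transcription error in the source would carry over.

```python
# Sequences as recited in the text above (line-wrap spaces removed).
UPSTREAM_ORF_SEQ_ID_315 = (
    "RCCCRGCSCRARLCPPARSTAVAPECRGAHPSR"
    "AMRPGTALQAVLLAVLLVGLRAATGRLLSGQPVCRGGTQRPCYKVIYFHD"
    "TSRRLNFEEAKEACRRGWRPASQHRVLKMNRN"
)
FRAGMENT_SEQ_ID_316 = (
    "MRPGTALQAVLLAVLLVGLRAATGRLLSGQPVCRGG"
    "TQRPCYKVIYFHDTSRRLNFEEAKEACRRGWRPASQHRVLKMNRN"
)


def from_first_methionine(seq: str) -> str:
    """Return the part of seq starting at its first Met (M), or "" if there is none."""
    start = seq.find("M")
    return seq[start:] if start >= 0 else ""


# True when the fragment equals the upstream ORF read from its first Met.
fragment_matches = from_first_methionine(UPSTREAM_ORF_SEQ_ID_315) == FRAGMENT_SEQ_ID_316
```

This only compares the recited strings; it does not by itself establish where the predicted signal peptide begins.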

This gene is expressed primarily in smooth muscle and human thyroid and to a 
lesser extent in amniotic cells and human endometrial stromal cells treated with 
progesterone. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, thyroid disorders. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the endocrine system, expression of this gene at 
significantly higher or lower levels is routinely detected in certain tissues or cell types 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid and spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 163 as residues: Ser-75 to Leu-81. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of endocrine 
disorders of the thyroid. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:50 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1748 of SEQ ID NO:50, b is an 
integer of 15 to 1762, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO: 50, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 41 

This gene is expressed primarily in human testes tumor and bone marrow. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, disorders of the testicles, including but not limited to testicular 
cancer, and immune system disorders. Similarly, polypeptides and antibodies directed 
to these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the male reproductive system and immune system, 
expression of this gene at significantly higher or lower levels is routinely detected in 
certain tissues or cell types (e.g., reproductive, immune, cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative 
to the standard gene expression level, i.e., the expression level in healthy tissue or 
bodily fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 164 as residues: His-31 to Gly-41. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in testes, particularly testicular tumors, indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for the 
treatment and diagnosis of conditions concerning proper testicular function (e.g. 
endocrine function, sperm maturation), as well as cancer. Therefore, this gene product 
is useful in the treatment of male infertility and/or impotence. This gene product is 
also useful in assays designed to identify binding agents, as such agents (antagonists) 
are useful as male contraceptive agents. Similarly, the protein is believed to be useful 
in the treatment and/or diagnosis of testicular cancer. The testes are also a site of 
active gene expression of transcripts that are expressed, particularly at low levels, in 
other tissues of the body. Therefore, this gene product is expressed in other specific 
tissues or organs where it may play related functional roles in other processes, such as 
hematopoiesis, inflammation, bone formation, and kidney function, to name a few 
possible target indications. The tissue distribution in bone marrow indicates 
polynucleotides and polypeptides corresponding to this gene are useful for the 
diagnosis and treatment of a variety of immune system disorders. Representative uses 
are described in the "Immune Activity" and "infectious disease" sections below, in 
Example 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the 
expression of this gene product indicates a role in regulating the proliferation, 
survival, differentiation, and/or activation of hematopoietic cell lineages, including 
blood stem cells. This gene product is involved in the regulation of cytokine 
production, antigen presentation, or other processes suggesting a usefulness in the 
treatment of cancer (e.g. by boosting immune responses). 

Since the gene is expressed in cells of lymphoid origin, the natural gene 
product is involved in immune functions. Therefore it is also useful as an agent for 
immunological disorders including arthritis, asthma, immunodeficiency diseases such 
as AIDS, leukemia, rheumatoid arthritis, granulomatous Disease, inflammatory 
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities, 
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and 
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity 
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic 
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's 
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that 
influences the differentiation or behavior of other blood cells, or that recruits 
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in 
the expansion of stem cells and committed progenitors of various blood lineages, and 
in the differentiation and/or proliferation of various cell types. Furthermore, the 
protein may also be used to determine biological activity, to raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. The protein, as well as 
antibodies directed against the protein, may show utility as a tumor marker and/or 
immunotherapy target for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:51 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 2045 of SEQ ID NO:51, b is an 
integer of 15 to 2059, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:51, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 42 

The translation product of this gene shares sequence homology with 
protocadherins, which are related to cadherin and possess cell adhesive ability. 
Cadherins are glycosylated integral membrane proteins that are involved in cell-cell 
adhesion. 

This gene is expressed primarily in brain (infant, adult frontal lobe, manic 
depression tissue) and to a lesser extent in epididymis, healing groin wounds, ovary, 
adipocytes, and fetal tissue (e.g., kidney and retina). 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, neurodegenerative disorders, impaired male and female fertility, 
developmental disorders, fibrosis, and manic depression. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the nervous system and 
reproductive system, expression of this gene at significantly higher or lower levels is 
routinely detected in certain tissues or cell types (e.g., neural, reproductive, cancerous 
and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid and 
spinal fluid) or another tissue or cell sample taken from an individual having such a 
disorder, relative to the standard gene expression level, i.e., the expression level in 
healthy tissue or bodily fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 165 as residues: Val-35 to Lys-41, Ser-68 to Gln-73, 
Glu-88 to Glu-93, Arg-156 to Gly-163, Ala-199 to Gly-206, Asp-216 to Ser-226, 
Thr-249 to Asn-254, Asp-339 to Pro-345, Ile-370 to Gly-379, Pro-429 to Glu-434, 
Arg-461 to Pro-466, Ala-475 to Thr-482, Pro-585 to Gly-593, Glu-631 to Gln-639, 
Pro-674 to Pro-682, Gln-715 to Gly-720, Ser-736 to Arg-742. Polynucleotides encoding 
said polypeptides are also provided. 

BLAST analysis reveals high homology to protocadherin sequences. These 
sequences are related to cadherin and possess cell adhesive ability. Such proteins 
may have regulatory functions in the cell, as well as the cell-cell adhesive properties. 
Antibodies produced against these sequences are useful for modulating the binding 
activity of these protocadherins and can be used therapeutically. The tissue 
distribution in brain indicates polynucleotides and polypeptides corresponding to this 
gene are useful for the detection, treatment, and/or prevention of neurodegenerative 
disease states, behavioral disorders, or inflammatory conditions. Representative uses 
are described in the "Regeneration" and "Hyperproliferative Disorders" sections 
below, in Example 11, 15, and 18, and elsewhere herein. Briefly, the uses include, but 
are not limited to the detection, treatment, and/or prevention of Alzheimer's Disease, 
Parkinson's Disease, Huntington's Disease, Tourette Syndrome, meningitis, 
encephalitis, demyelinating diseases, peripheral neuropathies, neoplasia, trauma, 
congenital malformations, spinal cord injuries, ischemia and infarction, aneurysms, 
hemorrhages, schizophrenia, mania, dementia, paranoia, obsessive compulsive 
disorder, depression, panic disorder, learning disabilities, ALS, psychoses, autism, 
and altered behaviors, including disorders in feeding, sleep patterns, balance, and 
perception. In addition, elevated expression of this gene product in regions of the 
brain indicates it plays a role in normal neural function. 

Potentially, this gene product is involved in synapse formation, 
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or 
survival. The tissue distribution in epididymis indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for the treatment and diagnosis of 
conditions concerning proper testicular function (e.g. endocrine function, sperm 
maturation), as well as cancer. Therefore, this gene product is useful in the treatment 
of male infertility and/or impotence. This gene product is also useful in assays 
designed to identify binding agents, as such agents (antagonists) are useful as male 
contraceptive agents. Similarly, the protein is believed to be useful in the treatment 
and/or diagnosis of testicular cancer. The testes are also a site of active gene 
expression of transcripts that are expressed, particularly at low levels, in other tissues 
of the body. Therefore, this gene product is expressed in other specific tissues or 
organs where it may play related functional roles in other processes, such as 
hematopoiesis, inflammation, bone formation, and kidney function, to name a few 
possible target indications. Moreover, the expression within fetal tissue (e.g., kidney 
and retina) and other cellular sources marked by proliferating cells indicates this 
protein may play a role in the regulation of cellular division, and may show utility in 
the diagnosis, treatment, and/or prevention of developmental diseases and disorders, 
including blindness, cancer, and other proliferative conditions. Representative uses 
are described in the "Hyperproliferative Disorders" and "Regeneration" sections 
below and elsewhere herein. Briefly, developmental tissues rely on decisions 
involving cell differentiation and/or apoptosis in pattern formation. 

Dysregulation of apoptosis can result in inappropriate suppression of cell 
death, as occurs in the development of some cancers, or in failure to control the extent 
of cell death, as is believed to occur in acquired immunodeficiency and certain 
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of 
potential roles in proliferation and differentiation, this gene product may have 
applications in the adult for tissue regeneration and the treatment of cancers. It may 
also act as a morphogen to control cell and tissue type specification. Therefore, the 
polynucleotides and polypeptides of the present invention are useful in treating, 
detecting, and/or preventing said disorders and conditions, in addition to other types 
of degenerative conditions. Thus this protein may modulate apoptosis or tissue 
differentiation and is useful in the detection, treatment, and/or prevention of 
degenerative or proliferative conditions and diseases. The protein is useful in 
modulating the immune response to aberrant polypeptides, as may exist in 
proliferating and cancerous cells and tissues. The protein can also be used to gain new 
insight into the regulation of cellular growth and proliferation. Furthermore, the 
protein may also be used to determine biological activity, to raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. The protein, as well as 
antibodies directed against the protein, may show utility as a tumor marker and/or 
immunotherapy target for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:52 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 3268 of SEQ ID NO:52, b is an 
integer of 15 to 3282, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:52, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 43 

Preferred polypeptides of the invention comprise the following amino acid 
sequence: 
IRHEQQGEEDDEHARPL.\ESLLLAIADLLFCPDFTVQSHRRSTVDSAEDVHSL 
DSCEYIWEAGVGFAHSPQPNYIHDMNRMELLKLLLTCFSEAMYLPPAPESGS 
TNPWVQFFCSTENRHALPLFTSLLNTVCAYDPVGYGIPYNHLLFSDYREPLVE 
EAAQVLIVTLDHDSASSASPTVDGTTTGTAMDDADPPGPENLFVNYLSRIHRE 
EDFQFILKGIARLLSNPLLQTYLPNSTKKDPVPPGAASSLLEALRLQQEIPLLRA 
EEQRRPRHPCPHPLLPQRCPGRSV (SEQ ID NO: 317). Polynucleotides encoding 
such polypeptides are also provided. 

This gene is expressed primarily in brain, breast, breast cancer tissue and to a 
lesser extent in epididymis, amniotic cells, and embryo tissue. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, neurodegenerative disorders, impaired CNS function, male sterility, 
and breast cancer. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the nervous and reproductive systems, expression of 
this gene at significantly higher or lower levels is routinely detected in certain tissues 
or cell types (e.g., neural, male reproductive, cancerous and wounded tissues) or 
bodily fluids (e.g., amniotic, serum, plasma, urine, synovial fluid and spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative 
to the standard gene expression level, i.e., the expression level in healthy tissue or 
bodily fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 166 as residues: Pro-22 to Pro-31, SerOS to His-43, 
Asp-74 to Leu-79, Asp-113 to Glu-121, Leu-157 to Val-166, Ala-189 to Arg-196, 
Gln-206 to Arg-211. Polynucleotides encoding said polypeptides are also provided. 

The tissue distribution in brain, particularly in the cerebellum, indicates 
polynucleotides and polypeptides corresponding to this gene are useful for the 
detection, treatment, and/or prevention of neurodegenerative disease states, behavioral 
disorders, or inflammatory conditions. Representative uses are described in the 
"Regeneration" and "Hyperproliferative Disorders" sections below, in Example 11, 
15, and 18, and elsewhere herein. Briefly, the uses include, but are not limited to the 
detection, treatment, and/or prevention of Alzheimer's Disease, Parkinson's Disease, 
Huntington's Disease, Tourette Syndrome, meningitis, encephalitis, demyelinating 
diseases, peripheral neuropathies, neoplasia, trauma, congenital malformations, spinal 
cord injuries, ischemia and infarction, aneurysms, hemorrhages, schizophrenia, 
mania, dementia, paranoia, obsessive compulsive disorder, depression, panic disorder, 
learning disabilities, ALS, psychoses, autism, and altered behaviors, including 
disorders in feeding, sleep patterns, balance, and perception. In addition, elevated 
expression of this gene product in regions of the brain indicates it plays a role in 
normal neural function. 

Potentially, this gene product is involved in synapse formation, 
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or 
survival. The tissue distribution in epididymis indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for the treatment and diagnosis of 
conditions concerning proper testicular function (e.g. endocrine function, sperm 
maturation), as well as cancer. Therefore, this gene product is useful in the treatment 
of male infertility and/or impotence. This gene product is also useful in assays 
designed to identify binding agents, as such agents (antagonists) are useful as male 
contraceptive agents. Similarly, the protein is believed to be useful in the treatment
and/or diagnosis of testicular cancer. The testes are also a site of active gene
expression of transcripts that are expressed, particularly at low levels, in other tissues
of the body. Therefore, this gene product is expressed in other specific tissues or 
organs where it may play related functional roles in other processes, such as 
hematopoiesis, inflammation, bone formation, and kidney function, to name a few
possible target indications. The expression in the breast tissue may indicate its uses in
the diagnosis and/or treatment of breast neoplasia and breast cancers, such as
fibroadenoma, papillary carcinoma, ductal carcinoma, Paget's Disease, medullary
carcinoma, mucinous carcinoma, tubular carcinoma, secretory carcinoma and
apocrine carcinoma, as well as juvenile hypertrophy and gynecomastia, mastitis and
abscess, duct ectasia, fat necrosis and fibrocystic diseases. Moreover, the expression
within embryonic tissue and other cellular sources marked by proliferating cells
indicates this protein may play a role in the regulation of cellular division, and may
show utility in the diagnosis, treatment, and/or prevention of developmental diseases
and disorders, including cancer, and other proliferative conditions. Representative
uses are described in the "Hyperproliferative Disorders" and "Regeneration" sections
below and elsewhere herein. Briefly, developmental tissues rely on decisions 
involving cell differentiation and/or apoptosis in pattern formation. 

Dysregulation of apoptosis can result in inappropriate suppression of cell 
death, as occurs in the development of some cancers, or in failure to control the extent 
of cell death, as is believed to occur in acquired immunodeficiency and certain 
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of 
potential roles in proliferation and differentiation, this gene product may have 
applications in the adult for tissue regeneration and the treatment of cancers. It may 
also act as a morphogen to control cell and tissue type specification. Therefore, the 
polynucleotides and polypeptides of the present invention are useful in treating, 
detecting, and/or preventing said disorders and conditions, in addition to other types 
of degenerative conditions. Thus this protein may modulate apoptosis or tissue 
differentiation and is useful in the detection, treatment, and/or prevention of 
degenerative or proliferative conditions and diseases. The protein is useful in 
modulating the immune response to aberrant polypeptides, as may exist in 
proliferating and cancerous cells and tissues. The protein can also be used to gain new 
insight into the regulation of cellular growth and proliferation. Furthermore, the 
protein may also be used to determine biological activity, to raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. Protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:53 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1846 of SEQ ID NO:53, b is an
integer of 15 to 1860, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:53, and where b is greater than or equal to a + 14. 
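
The a-b formula can be read as a simple membership test on a start position a and an end position b. The sketch below is only an illustration: it assumes 1-based, inclusive nucleotide positions, and the function names are hypothetical rather than taken from this specification. It checks whether a given (a, b) pair falls within the formula for a sequence of a given length, and counts how many such pairs the formula covers.

```python
def satisfies_ab_formula(a: int, b: int, seq_len: int) -> bool:
    """Return True if fragment a-b falls within the general formula.

    Assumes 1-based inclusive positions, so the shortest fragment
    (b == a + 14) spans 15 nucleotides.
    """
    return (
        1 <= a <= seq_len - 14      # e.g. 1 to 1846 when seq_len is 1860
        and 15 <= b <= seq_len      # e.g. 15 to 1860
        and b >= a + 14             # fragment is at least 15 residues long
    )


def count_ab_fragments(seq_len: int) -> int:
    """Count the distinct (a, b) pairs described by the formula."""
    n = seq_len - 14                # number of permitted start positions
    if n <= 0:
        return 0
    # Start position a allows (seq_len - 14 - a + 1) end positions;
    # summing over a = 1..n gives the triangular number n(n + 1)/2.
    return n * (n + 1) // 2


if __name__ == "__main__":
    # SEQ ID NO:53 is described with a up to 1846 and b up to 1860.
    print(satisfies_ab_formula(1, 15, 1860))
    print(satisfies_ab_formula(1847, 1860, 1860))
    print(count_ab_fragments(1860))
```

The same check applies to the other sequences in this section by substituting the stated upper bound of b (for example 770 for SEQ ID NO:54) as the sequence length.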

FEATURES OF PROTEIN ENCODED BY GENE NO: 44

Contact of cells with supernatant expressing the product of this gene increases 
the permeability of monocytes to calcium. Thus, it is likely that the product of this
gene is involved in a signal transduction pathway that is initiated when the product of
this gene binds a receptor on the surface of the monocyte cell. Thus, polynucleotides 
and polypeptides have uses which include, but are not limited to, activating monocyte 
cells. 

This gene is expressed primarily in CD34 positive cells derived from human 
cord blood. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, hematopoietic disorders; immune dysfunction; defects in hematopoietic
stem and progenitor cells; susceptibility to chemotherapy and irradiation. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the immune system, 
expression of this gene at significantly higher or lower levels is routinely detected in 
certain tissues or cell types (e.g., immune, cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from
an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 167 as residues: Ala-38 to Leu-59, Ala-63 to Thr-71,
Lys-82 to Leu-91, Glu-97 to Ser-107, Gln-143 to Ala-149, Ile-153 to Leu-158, Ser-169
to Arg-182. Polynucleotides encoding said polypeptides are also provided.

Elevated expression of this gene product in CD34 positive hematopoietic cells
indicates that it is expressed by early stem and progenitor cells of the hematopoietic 
lineages. Therefore, this may represent a soluble factor that is able to control the 
survival, proliferation, differentiation, or activation of all hematopoietic lineages, 
including stem and progenitor cells. Thus, it could be quite useful, for example, in ex 
vivo expansion of stem cell numbers for hematopoietic disorders or for cancer 
patients. Alternately, it may represent a factor that influences the hematopoietic 
microenvironment by affecting stromal cells that release other factors required for 
hematopoietic development. Additionally, the tissue distribution in CD34 positive 
cells also indicates polynucleotides and polypeptides corresponding to this gene are 
useful for the treatment and diagnosis of hematopoietic related disorders such as 
anemia, pancytopenia, leukopenia, thrombocytopenia or leukemia since stromal cells 
are important in the production of cells of hematopoietic lineages. Representative uses 
are described in the "Immune Activity" and "Infectious Disease" sections below, in
Example 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the uses 
include bone marrow cell ex-vivo culture, bone marrow transplantation, bone marrow 
reconstitution, radiotherapy or chemotherapy of neoplasia.

The gene product may also be involved in lymphopoiesis; therefore, it can be
used in immune disorders such as infection, inflammation, allergy, immunodeficiency,
etc. In addition, this gene product may have commercial utility in the expansion of
stem cells and committed progenitors of various blood lineages, and in the 
differentiation and/or proliferation of various cell types. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:54 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 756 of SEQ ID NO:54, b is an 
integer of 15 to 770, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:54, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 45

This gene is expressed primarily in breast and 12-week-old human embryos
and to a lesser extent in stomach cancer and liver. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, breast cancer; stomach cancer; embryonic defects; hepatic disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the
digestive and endocrine systems, expression of this gene at significantly higher or
lower levels is routinely detected in certain tissues or cell types (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy 
tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that the protein products of this gene are 
useful for the diagnosis and/or treatment of a variety of disorders. Elevated expression 
of this gene product in stomach cancer indicates it is useful as a marker or therapeutic
target for stomach cancer. Alternately, expression in breast tissue is influenced by the 
presence or absence of breast cancer tissue, and may thus also serve as a diagnostic 
marker for this cancer as well. Expression in the developing embryo may correlate
with the normal development of human embryos, and expression in the liver is
involved in the regulation of normal liver function and/or liver regeneration. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:55 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1079 of SEQ ID NO:55, b is an
integer of 15 to 1093, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:55, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 46 

This gene is expressed primarily in human hypothalamus derived from a 
patient with schizophrenia. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, schizophrenia; neurological disorders; impaired nervous system
function. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the 
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the nervous system, expression of this gene at significantly higher or 
lower levels is routinely detected in certain tissues or cell types (e.g., neural,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid and spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression
level in healthy tissue or bodily fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 169 as residues: Glu-34 to Trp-39. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in brain, particularly in the hypothalamus, indicates 
polynucleotides and polypeptides corresponding to this gene are useful for the 
detection, treatment, and/or prevention of neurodegenerative disease states, behavioral 
disorders, or inflammatory conditions. Representative uses are described in the
"Regeneration" and "Hyperproliferative Disorders" sections below, in Example 11,
15, and 18, and elsewhere herein. Briefly, the uses include, but are not limited to the
detection, treatment, and/or prevention of Alzheimer's Disease, Parkinson's Disease,
Huntington's Disease, Tourette Syndrome, meningitis, encephalitis, demyelinating
diseases, peripheral neuropathies, neoplasia, trauma, congenital malformations, spinal
cord injuries, ischemia and infarction, aneurysms, hemorrhages, schizophrenia,
mania, dementia, paranoia, obsessive compulsive disorder, depression, panic disorder, 
learning disabilities, ALS, psychoses, autism, and altered behaviors, including 
disorders in feeding, sleep patterns, balance, and perception. In addition, elevated 
expression of this gene product in regions of the brain indicates it plays a role in 
normal neural function. 

Potentially, this gene product is involved in synapse formation, 
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or 
survival. Furthermore, the protein may also be used to determine biological activity, 
to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to 
identify agents that modulate their interactions, in addition to its use as a nutritional 
supplement. Protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:56 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 618 of SEQ ID NO:56, b is an 
integer of 15 to 632, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:56, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 47 

The translation product of this gene shares sequence homology with human 
lecithin-cholesterol acyltransferase (LCAT), which catalyses the transfer of fatty acid 
from the sn-2 position of lecithin to the free hydroxyl group of cholesterol. Preferred
polypeptides of the invention comprise the following amino acid sequence: RLVYN 
KTSRATQFPDGVDVRVPGFGKTFSLEFLDPSKSSVGSYFHTMVESLVGWGYT 
RGEDVRGAPYDWRRAPNENGPYFLALREMIEEMYQLYGGPVVLVAHSMGN 
MYTLYFLQRQPQAWKDKYIRAFVSLGAPWGGVAKTLRVLASGDNNRIPVIG
PLKIREQQRSAVSTSWLLPYNYTWSPEKVFVQTPTINYTLRDYRKFFQDIGFE
DGWLiMRQDTEGLVEATMPPGVQLHCLYGTGVPTPDSFYYESFPDRDPKICFG 
DGDGTVNLKSALQCQAWQSRQEHQVLLQELPGSEHIEMLANATTLAYLKRV 
LLGP (SEQ ID NO: 318). Polynucleotides encoding such polypeptides are also 
provided. 

This gene is expressed primarily in osteoblasts and dendritic cells and to a lesser
extent in muscle and other hematopoietic cell lineages. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, hematopoietic disorders; immune dysfunction; osteoporosis;
osteopetrosis; muscle degeneration. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the skeletal and immune systems, expression of this 
gene at significantly higher or lower levels is routinely detected in certain tissues or
cell types (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 170 as residues: Cys-65 to Ser-71. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution and homology to lecithin-cholesterol acyltransferase 
(LCAT) indicates that polynucleotides and polypeptides corresponding to this gene 
are useful for the diagnosis and/or treatment of a variety of disorders. For example, 
atherosclerosis is a pathological condition of mammals characterised by the
accumulation of cholesterol in the arteries, which leads to heart disease, strokes, heart
attacks and peripheral vascular disease. The enzyme could be used in a novel method 
of treating atherosclerosis, which involves increasing the level of LCAT activity, 
which then causes a decrease in the accumulation of cholesterol. The method and the
products can be used for the prophylaxis and treatment of atherosclerosis, and
associated heart disease, myocardial infarction, stroke and peripheral vascular disease,
as well as individuals suffering from Fish Eye Syndrome (caused by LCAT
deficiency) or Classic LCAT Deficiency Syndrome. Alternately, elevated expression
of this gene product in osteoblasts and hematopoietic cell lineages indicates that it
may play additional roles in bone turnover, regulation of immune system function,
and muscular function. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:57 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 2673 of SEQ ID NO:57, b is an
integer of 15 to 2687, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:57, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 48 

When tested against HELA epithelial cell lines, supernatants removed from
cells containing this gene activated the GAS assay. Thus, it is likely that this gene 
activates epithelial cells through the Jak-STAT signal transduction pathway. The 
gamma activating sequence (GAS) is a promoter element found upstream of many
genes which are involved in the Jak-STAT pathway. The Jak-STAT pathway is a
large, signal transduction pathway involved in the differentiation and proliferation of
cells. Therefore, activation of the Jak-STAT pathway, reflected by the binding of the 
GAS element, can be used to indicate proteins involved in the proliferation and 
differentiation of cells. 

This gene is expressed primarily in adult brain, infant brain, fibroblasts, 
embryonic and fetal tissue (e.g., spleen, liver), placenta and to a lesser extent in
endocrine organs, cancerous colon and breast.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, dementia, epilepsy, schizophrenia, and developmental abnormalities. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
neural system, endocrine system, and during development, expression of this gene at 
significantly higher or lower levels is routinely detected in certain tissues or cell types 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid and spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution in brain indicates polynucleotides and polypeptides 
corresponding to this gene are useful for the detection, treatment, and/or prevention of
neurodegenerative disease states, behavioral disorders, or inflammatory conditions. 
Representative uses are described in the "Regeneration" and "Hyperproliferative 
Disorders" sections below, in Example 11, 15, and 18, and elsewhere herein. Briefly,
the uses include, but are not limited to the detection, treatment, and/or prevention of 
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, Tourette Syndrome, 
meningitis, encephalitis, demyelinating diseases, peripheral neuropathies, neoplasia, 
trauma, congenital malformations, spinal cord injuries, ischemia and infarction, 
aneurysms, hemorrhages, schizophrenia, mania, dementia, paranoia, obsessive 
compulsive disorder, depression, panic disorder, learning disabilities, ALS, 
psychoses, autism, and altered behaviors, including disorders in feeding, sleep 
patterns, balance, and perception. In addition, elevated expression of this gene product 
in regions of the brain indicates it plays a role in normal neural function. 

Potentially, this gene product is involved in synapse formation, 
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or 
survival. In addition, the expression of this gene product in synovium (synovial 
sarcoma) would suggest a role in the detection and treatment of disorders and
conditions afflicting the skeletal system, in particular osteoporosis, bone cancer, 
connective tissue disorders (e.g. arthritis, trauma, tendonitis, chondromalacia and
inflammation). The protein is also useful in the diagnosis or treatment of various
autoimmune disorders (i.e., rheumatoid arthritis, lupus, scleroderma, and
dermatomyositis), dwarfism, spinal deformation, joint abnormalities, and
chondrodysplasias (i.e. spondyloepiphyseal dysplasia congenita, familial
osteoarthritis, Atelosteogenesis type II, metaphyseal chondrodysplasia type Schmid,
etc.). The tissue distribution in endocrine tissues indicates polynucleotides and
polypeptides corresponding to this gene are useful for the detection, treatment, and/or
prevention of various endocrine disorders and cancers. Representative uses are
described in the "Biological Activity", "Hyperproliferative Disorders", and "Binding
Activity" sections below, in Example 11, 17, 18, 19, 20 and 27, and elsewhere herein.
Briefly, the protein can be used for the detection, treatment, and/or prevention of
Addison's Disease, Cushing's Syndrome, and disorders and/or cancers of the
pancreas (e.g. diabetes mellitus), adrenal cortex, ovaries, pituitary (e.g., hyper-,
hypopituitarism), thyroid (e.g. hyper-, hypothyroidism), parathyroid (e.g. hyper-,
hypoparathyroidism), hypothalamus, and testes. Additionally, the expression within
fetal tissue, cancerous colon and breast, and other cellular sources marked by 
proliferating cells indicates this protein may play a role in the regulation of cellular 
division, and may show utility in the diagnosis, treatment, and/or prevention of 
developmental diseases and disorders, including cancer, and other proliferative 
conditions. Representative uses are described in the "Hyperproliferative Disorders" 
and "Regeneration" sections below and elsewhere herein. Briefly, developmental 
tissues rely on decisions involving cell differentiation and/or apoptosis in pattern 
formation. 

Dysregulation of apoptosis can result in inappropriate suppression of cell 
death, as occurs in the development of some cancers, or in failure to control the extent 
of cell death, as is believed to occur in acquired immunodeficiency and certain 
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of
potential roles in proliferation and differentiation, this gene product may have
applications in the adult for tissue regeneration and the treatment of cancers. It may
also act as a morphogen to control cell and tissue type specification. Therefore, the
polynucleotides and polypeptides of the present invention are useful in treating,
detecting, and/or preventing said disorders and conditions, in addition to other types
of degenerative conditions. Thus this protein may modulate apoptosis or tissue
differentiation and is useful in the detection, treatment, and/or prevention of
degenerative or proliferative conditions and diseases. The protein is useful in
modulating the immune response to aberrant polypeptides, as may exist in
proliferating and cancerous cells and tissues. The protein can also be used to gain new
insight into the regulation of cellular growth and proliferation. Furthermore, the
protein may also be used to determine biological activity, to raise antibodies, as tissue
markers, to isolate cognate ligands or receptors, to identify agents that modulate their
interactions, in addition to its use as a nutritional supplement. Protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:58 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 605 of SEQ ID NO:58, b is an
integer of 15 to 619, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:58, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 49

Preferred polypeptides of the invention comprise the following amino acid 
sequence or a subfragment thereof: MNKEDKVWNDCKGVNKLTNLEEQYIILIFQ 
NGLDPPANMVFESIINEIGIKNNISNFFAKIPFEEANGRLVACTRTYEESIKGSC
GQKENKIKTVSFESKIQLRSKQEFQFFDEEEETGENHTIFIGPVEKLIVYPPPPA
KGGISVTNEDLHCLNEGEFLNDVIIDFYLKYLVLEKLKKEDADRIHIFSSFFYK
RLNQRERRNHETTNLSIQQKRHGRVKTWTRHVDIFEKDFIFVPLNEAAHWFL
AVVCFPGLEKPKYEPNPHYHENAVIQKCSTVEDSCISSSASEMESCSQNSSAK
PVIKmLNKKHCIAVIDSNPGQEESDPRYKRNlCSVKYSVKKlKHTASENEEF
NKGESTSQKS (SEQ ID NO: 319). Polynucleotides encoding such polypeptides are 
also provided. 

This gene is expressed primarily in fetal tissue, stomach, brain, endometrial 
cells, and bone and to a lesser extent in prostate, retina, adipocytes, smooth muscle,
and tumors of the endometrium, ovaries, and parathyroid.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, disorders of the endocrine system, ulcers, stomach cancer, epilepsy,
schizophrenia, dementia, bone growth, developmental disorders and resorption.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
digestive system and neural systems, expression of this gene at significantly higher or
lower levels is routinely detected in certain tissues or cell types (e.g., neural,
endocrine system, cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken
from an individual having such a disorder, relative to the standard gene expression
level, i.e., the expression level in healthy tissue or bodily fluid from an individual not
having the disorder.

The tissue distribution in brain indicates polynucleotides and polypeptides 
corresponding to this gene are useful for the detection, treatment, and/or prevention of 
neurodegenerative disease states, behavioral disorders, or inflammatory conditions.
Representative uses are described in the "Regeneration" and "Hyperproliferative
Disorders" sections below, in Example 11, 15, and 18, and elsewhere herein. Briefly,
the uses include, but are not limited to the detection, treatment, and/or prevention of
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, Tourette Syndrome,
meningitis, encephalitis, demyelinating diseases, peripheral neuropathies, neoplasia,
trauma, congenital malformations, spinal cord injuries, ischemia and infarction,
aneurysms, hemorrhages, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder, depression, panic disorder, learning disabilities, ALS,
psychoses, autism, and altered behaviors, including disorders in feeding, sleep
patterns, balance, and perception. In addition, elevated expression of this gene product
in regions of the brain indicates it plays a role in normal neural function.
Potentially, this gene product is involved in synapse formation,
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or
survival. Expression of this gene product in stomach tissue indicates involvement in 
digestion, processing, and elimination of food, as well as a potential role for this gene 
as a diagnostic marker or causative agent in the development of stomach cancer, and 
cancer in general. The expression within embryonic, fetal tissue and other cellular 

10 sources marked by proliferating cells indicates this protein may play a role in the 
regulation of cellular division, and may show utility in the diagnosis, treatment, 
and/or prevention of developmental diseases and disorders, including cancer, and 
other proliferative conditions. Representative uses are described in the 
"Hyperproliferative Disorders" and "Regeneration" sections below and elsewhere 

15 herein. Briefly, developmental tissues rely on decisions involving cell differentiation . 
and/or apoptosis in pauem formation. 

Dysregulation of apoptosis can result in inappropriate suppression of cell
death, as occurs in the development of some cancers, or in failure to control the extent
of cell death, as is believed to occur in acquired immunodeficiency and certain
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of
potential roles in proliferation and differentiation, this gene product may have
applications in the adult for tissue regeneration and the treatment of cancers. It may
also act as a morphogen to control cell and tissue type specification. Therefore, the
polynucleotides and polypeptides of the present invention are useful in treating,
detecting, and/or preventing said disorders and conditions, in addition to other types
of degenerative conditions. Thus this protein may modulate apoptosis or tissue
differentiation and is useful in the detection, treatment, and/or prevention of
degenerative or proliferative conditions and diseases. The protein is useful in
modulating the immune response to aberrant polypeptides, as may exist in
proliferating and cancerous cells and tissues. The protein can also be used to gain new
insight into the regulation of cellular growth and proliferation. The tissue distribution
in parathyroid tumor indicates polynucleotides and polypeptides corresponding to this
gene are useful for the detection, treatment, and/or prevention of various endocrine
disorders and cancers. Representative uses are described in the "Biological Activity",
"Hyperproliferative Disorders", and "Binding Activity" sections below, in Example
11, 17, 18, 19, 20 and 27, and elsewhere herein. Briefly, the protein can be used for
the detection, treatment, and/or prevention of Addison's Disease, Cushing's
Syndrome, and disorders and/or cancers of the pancreas (e.g. diabetes mellitus),
adrenal cortex, ovaries, pituitary (e.g., hyper-, hypopituitarism), thyroid (e.g. hyper-,
hypothyroidism), parathyroid (e.g. hyper-, hypoparathyroidism), hypothalamus, and
testes. The tissue distribution in testes indicates that polynucleotides and
polypeptides corresponding to this gene are useful for the treatment and diagnosis of
conditions concerning proper testicular function (e.g. endocrine function, sperm
maturation), as well as cancer. Therefore, this gene product is useful in the treatment
of male infertility and/or impotence. This gene product is also useful in assays
designed to identify binding agents, as such agents (antagonists) are useful as male
contraceptive agents. Similarly, the protein is believed to be useful in the treatment
and/or diagnosis of testicular cancer. The testes are also a site of active gene
expression of transcripts that are expressed, particularly at low levels, in other tissues
of the body. Furthermore, the protein may also be used to determine biological
activity, to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to
identify agents that modulate their interactions, in addition to its use as a nutritional
supplement. Protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:59 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1364 of SEQ ID NO:59, b is an
integer of 15 to 1378, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:59, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 50

The translation product of this gene shares good protein homology with the
Xenopus NaDC-2 gene and a rabbit renal sodium/dicarboxylate cotransporter. The
translation product of this gene also shares good homology with a rat placental protein
which is a sodium-coupled high affinity dicarboxylate transporter. Therefore, it is
likely that the translated product encoded by this gene shares similar biological
activity.

This gene is expressed primarily in the placenta and colon adenocarcinoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, developmental abnormalities as well as failure to thrive anomalies. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the
female reproductive system and colon, expression of this gene at significantly higher 
or lower levels is routinely detected in certain tissues or cell types (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., amniotic, serum, plasma, urine, synovial fluid 
and spinal fluid) or another tissue or cell sample taken from an individual having such 
a disorder, relative to the standard gene expression level, i.e., the expression level in 
healthy tissue or bodily fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO:173 as residues: Lys-166 to Gly-181. Polynucleotides
encoding said polypeptides are also provided. 

The tissue distribution in human placenta and the shared homology of this
translation product to a rat placental protein indicate that polynucleotides and
polypeptides corresponding to this gene are useful for the diagnosis and/or treatment
of disorders of the placenta. Specific expression within the placenta indicates that this
gene product may play a role in the proper establishment and maintenance of
placental function. Alternatively, this gene product is produced by the placenta and then
transported to the embryo, where it may play a crucial role in the development and/or
survival of the developing embryo or fetus. Expression of this gene product in a
vascular-rich tissue such as the placenta also indicates that this gene product is 
produced more generally in endothelial cells or within the circulation. In such 
instances, it may play more generalized roles in vascular function, such as in 
angiogenesis. It may also be produced in the vasculature and have effects on other 
cells within the circulation, such as hematopoietic cells. It may serve to promote the 
proliferation, survival, activation, and/or differentiation of hematopoietic cells, as well 
as other cells throughout the body. The tissue distribution in colon tissue indicates 
that polynucleotides and polypeptides corresponding to this gene are useful for the 
diagnosis and/or treatment of disorders involving the colon. Expression of this gene 
product in colon tissue indicates involvement in digestion, processing, and 
elimination of food, as well as a potential role for this gene as a diagnostic marker or 
causative agent in the development of colon cancer, and cancer in general. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:60 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1112 of SEQ ID NO:60, b is an
integer of 15 to 1126, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:60, and where b is greater than or equal to a + 14.
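
By way of illustration only, the three conditions of the general formula a-b
(the range of a, the range of b, and b being at least a + 14) can be written as a
single range check. The sketch below is not part of the disclosure: the function
name in_general_formula and its parameters seq_len and min_len are hypothetical
names introduced here, with defaults taken from the bounds stated for SEQ ID NO:60.

```python
def in_general_formula(a: int, b: int, seq_len: int = 1126, min_len: int = 15) -> bool:
    """Return True if positions a..b satisfy the general formula a-b.

    Assumptions (illustrative only): seq_len=1126 mirrors the upper bound
    stated for b in SEQ ID NO:60, and min_len=15 follows from b >= a + 14.
    """
    max_a = seq_len - (min_len - 1)        # 1126 - 14 = 1112, the upper bound for a
    return (
        1 <= a <= max_a                    # a is an integer from 1 to 1112
        and min_len <= b <= seq_len        # b is an integer from 15 to 1126
        and b >= a + (min_len - 1)         # b is greater than or equal to a + 14
    )
```

For instance, in_general_formula(1, 15) holds, whereas in_general_formula(10, 20)
does not, because 20 is less than 10 + 14.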

FEATURES OF PROTEIN ENCODED BY GENE NO: 51 

This gene is expressed primarily in the spinal cord. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, paralysis and neurologic disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the nervous system, expression of this gene
at significantly higher or lower levels is routinely detected in certain tissues or cell
types (e.g., neural, cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken
from an individual having such a disorder, relative to the standard gene expression
level, i.e., the expression level in healthy tissue or bodily fluid from an individual not
having the disorder.

The tissue distribution in spinal cord indicates polynucleotides and
polypeptides corresponding to this gene are useful for the detection, treatment, and/or
prevention of neurodegenerative disease states, behavioral disorders, or inflammatory
conditions. Representative uses are described in the "Regeneration" and
"Hyperproliferative Disorders" sections below, in Example 11, 15, and 18, and
elsewhere herein. Briefly, the uses include, but are not limited to the detection,
treatment, and/or prevention of Alzheimer's Disease, Parkinson's Disease,
Huntington's Disease, Tourette Syndrome, meningitis, encephalitis, demyelinating
diseases, peripheral neuropathies, neoplasia, trauma, congenital malformations, spinal
cord injuries, ischemia and infarction, aneurysms, hemorrhages, schizophrenia,
mania, dementia, paranoia, obsessive compulsive disorder, depression, panic disorder,
learning disabilities, ALS, psychoses, autism, and altered behaviors, including
disorders in feeding, sleep patterns, balance, and perception. In addition, elevated
expression of this gene product in regions of the brain indicates it plays a role in
normal neural function.

Potentially, this gene product is involved in synapse formation,
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or
survival. Furthermore, the protein may also be used to determine biological activity,
to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to
identify agents that modulate their interactions, in addition to its use as a nutritional
supplement. Protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:61 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 2064 of SEQ ID NO:61, b is an
integer of 15 to 2078, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:61, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 52 

This gene is expressed primarily in keratinocytes, brain, fetal tissues,
pericardium, stomach, and cancerous tissues (e.g., stomach, adrenals, parathyroid,
germ cell, colon, breast).

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, skin disorders, neurodegenerative and developmental disorders, heart
disease, and cancers. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the cardiovascular and gastrointestinal systems,
expression of this gene at significantly higher or lower levels is routinely detected in
certain tissues or cell types (e.g., neural, immune, cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution in brain indicates polynucleotides and polypeptides
corresponding to this gene are useful for the detection, treatment, and/or prevention of
neurodegenerative disease states, behavioral disorders, or inflammatory conditions.
Representative uses are described in the "Regeneration" and "Hyperproliferative
Disorders" sections below, in Example 11, 15, and 18, and elsewhere herein. Briefly,
the uses include, but are not limited to the detection, treatment, and/or prevention of
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, Tourette Syndrome,
meningitis, encephalitis, demyelinating diseases, peripheral neuropathies, neoplasia,
trauma, congenital malformations, spinal cord injuries, ischemia and infarction,
aneurysms, hemorrhages, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder, depression, panic disorder, learning disabilities, ALS,
psychoses, autism, and altered behaviors, including disorders in feeding, sleep
patterns, balance, and perception. In addition, elevated expression of this gene product
in regions of the brain indicates it plays a role in normal neural function.
Potentially, this gene product is involved in synapse formation,
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or
survival. The tissue distribution in keratinocytes indicates polynucleotides and
polypeptides corresponding to this gene are useful for the treatment, diagnosis, and/or
prevention of various skin disorders. Representative uses are described in the
"Biological Activity", "Hyperproliferative Disorders", "infectious disease", and
"Regeneration" sections below, in Example 11, 19, and 20, and elsewhere herein.
Briefly, the protein is useful in detecting, treating, and/or preventing congenital
disorders (i.e. nevi, moles, freckles, Mongolian spots, hemangiomas, port-wine
syndrome), integumentary tumors (i.e. keratoses, Bowen's Disease, basal cell
carcinoma, squamous cell carcinoma, malignant melanoma, Paget's Disease, mycosis
fungoides, and Kaposi's sarcoma), injuries and inflammation of the skin (i.e. wounds,
rashes, prickly heat disorder, psoriasis, dermatitis), atherosclerosis, urticaria, eczema,
photosensitivity, autoimmune disorders (i.e. lupus erythematosus, vitiligo,
dermatomyositis, morphea, scleroderma, pemphigoid, and pemphigus), keloids, striae,
erythema, petechiae, purpura, and xanthelasma. In addition, such disorders may
predispose increased susceptibility to viral and bacterial infections of the skin (i.e.
cold sores, warts, chickenpox, molluscum contagiosum, herpes zoster, boils, cellulitis,
erysipelas, impetigo, tinea, athlete's foot, and ringworm). Moreover, the protein
product of this gene may also be useful for the treatment or diagnosis of various
connective tissue disorders (i.e., arthritis, trauma, tendonitis, chondromalacia and
inflammation, etc.), autoimmune disorders (i.e., rheumatoid arthritis, lupus,
scleroderma, dermatomyositis, etc.), dwarfism, spinal deformation, joint
abnormalities, and chondrodysplasias (i.e. spondyloepiphyseal dysplasia congenita,
familial osteoarthritis, Atelosteogenesis type II, metaphyseal chondrodysplasia type
Schmid). The expression within fetal tissue (e.g., spleen and liver) and other cellular
sources marked by proliferating cells indicates this protein may play a role in the
regulation of cellular division, and may show utility in the diagnosis, treatment,
and/or prevention of developmental diseases and disorders, including cancer, and
other proliferative conditions. Representative uses are described in the
"Hyperproliferative Disorders" and "Regeneration" sections below and elsewhere
herein. Briefly, developmental tissues rely on decisions involving cell differentiation
and/or apoptosis in pattern formation.

Dysregulation of apoptosis can result in inappropriate suppression of cell 
death, as occurs in the development of some cancers, or in failure to control the extent 
of cell death, as is believed to occur in acquired immunodeficiency and certain 
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of 
potential roles in proliferation and differentiation, this gene product may have 
applications in the adult for tissue regeneration and the treatment of cancers. It may 
also act as a morphogen to control cell and tissue type specification. Therefore, the 
polynucleotides and polypeptides of the present invention are useful in treating, 
detecting, and/or preventing said disorders and conditions, in addition to other types 
of degenerative conditions. Thus this protein may modulate apoptosis or tissue 
differentiation and is useful in the detection, treatment, and/or prevention of 
degenerative or proliferative conditions and diseases. The protein is useful in
modulating the immune response to aberrant polypeptides, as may exist in 
proliferating and cancerous cells and tissues. The protein can also be used to gain new 
insight into the regulation of cellular growth and proliferation. Additionally, the 
tissue distribution in the pericardium of the heart indicates that the protein is useful in 
the detection, treatment, and/or prevention of a variety of vascular disorders and 
conditions, which include, but are not limited to microvascular disease, vascular leak
syndrome, aneurysm, stroke, embolism, thrombosis, coronary artery disease, 
arteriosclerosis, and/or atherosclerosis. Furthermore, the protein may also be used to 
determine biological activity, to raise antibodies, as tissue markers, to isolate cognate 
ligands or receptors, to identify agents that modulate their interactions, in addition to 
its use as a nutritional supplement. Protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and/or immunotherapy targets for the
above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:62 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 748 of SEQ ID NO:62, b is an 
integer of 15 to 762, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:62, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 53 

This gene is expressed primarily in the brain and in cartilage and to a lesser 
extent in the retina, activated T-cells, pineal gland, the lungs, and in synovial 
sarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, neurological diseases, such as epilepsy and dementia, osteoarthritis,
retinopathies, hematopoietic diseases, emphysema, and lung cancer. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the neurologic
system, cartilage and musculature, vision, the hematopoietic system, and the
pulmonary system, expression of this gene at significantly higher or lower levels is
routinely detected in certain tissues or cell types (e.g., neural, immune, cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy
tissue or bodily fluid from an individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO:176 as residues: Arg-34 to Cys-44. Polynucleotides
encoding said polypeptides are also provided.

The tissue distribution in brain indicates polynucleotides and polypeptides
corresponding to this gene are useful for the detection, treatment, and/or prevention of
neurodegenerative disease states, behavioral disorders, or inflammatory conditions.
Representative uses are described in the "Regeneration" and "Hyperproliferative
Disorders" sections below, in Example 11, 15, and 18, and elsewhere herein. Briefly,
the uses include, but are not limited to the detection, treatment, and/or prevention of
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, Tourette Syndrome,
meningitis, encephalitis, demyelinating diseases, peripheral neuropathies, neoplasia,
trauma, congenital malformations, spinal cord injuries, ischemia and infarction,
aneurysms, hemorrhages, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder, depression, panic disorder, learning disabilities, ALS,
psychoses, autism, and altered behaviors, including disorders in feeding, sleep
patterns, balance, and perception. In addition, elevated expression of this gene product
in regions of the brain indicates it plays a role in normal neural function.

Potentially, this gene product is involved in synapse formation,
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or
survival. The tissue distribution in T-cells indicates polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of a variety of
immune system disorders. Representative uses are described in the "Immune
Activity" and "infectious disease" sections below, in Example 11, 13, 14, 16, 18, 19,
20, and 27, and elsewhere herein. Briefly, the expression of this gene product
indicates a role in regulating the proliferation; survival; differentiation; and/or
activation of hematopoietic cell lineages, including blood stem cells. This gene
product is involved in the regulation of cytokine production, antigen presentation, or
other processes suggesting a usefulness in the treatment of cancer (e.g. by boosting
immune responses).

Since the gene is expressed in cells of lymphoid origin, the natural gene
product is involved in immune functions. Therefore it is also useful as an agent for
immunological disorders including arthritis, asthma, immunodeficiency diseases such
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that
influences the differentiation or behavior of other blood cells, or that recruits
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in
the expansion of stem cells and committed progenitors of various blood lineages, and
in the differentiation and/or proliferation of various cell types. The expression of this
gene product in synovium would suggest a role in the detection and treatment of
disorders and conditions afflicting the skeletal system, in particular osteoporosis, bone
cancer, connective tissue disorders (e.g. arthritis, trauma, tendonitis, chondromalacia
and inflammation). The protein is also useful in the diagnosis or treatment of various
autoimmune disorders (i.e., rheumatoid arthritis, lupus, scleroderma, and
dermatomyositis), dwarfism, spinal deformation, joint abnormalities, and
chondrodysplasias (i.e. spondyloepiphyseal dysplasia congenita, familial
osteoarthritis, Atelosteogenesis type II, metaphyseal chondrodysplasia type Schmid,
etc.). Additionally, the expression within fetal tissue and other cellular sources
marked by proliferating cells indicates this protein may play a role in the regulation of
cellular division, and may show utility in the diagnosis, treatment, and/or prevention
of developmental diseases and disorders, including cancer, and other proliferative
conditions. Representative uses are described in the "Hyperproliferative Disorders"
and "Regeneration" sections below and elsewhere herein. Briefly, developmental
tissues rely on decisions involving cell differentiation and/or apoptosis in pattern
formation.

Dysregulation of apoptosis can result in inappropriate suppression of cell 
death, as occurs in the development of some cancers, or in failure to control the extent 
of cell death, as is believed to occur in acquired immunodeficiency and certain 
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of 
potential roles in proliferation and differentiation, this gene product may have
applications in the adult for tissue regeneration and the treatment of cancers. It may
also act as a morphogen to control cell and tissue type specification. Therefore, the
polynucleotides and polypeptides of the present invention are useful in treating,
detecting, and/or preventing said disorders and conditions, in addition to other types
of degenerative conditions. Thus this protein may modulate apoptosis or tissue
differentiation and is useful in the detection, treatment, and/or prevention of
degenerative or proliferative conditions and diseases. The protein is useful in
modulating the immune response to aberrant polypeptides, as may exist in
proliferating and cancerous cells and tissues. The protein can also be used to gain new
insight into the regulation of cellular growth and proliferation. Furthermore, the
protein may also be used to determine biological activity, to raise antibodies, as tissue
markers, to isolate cognate ligands or receptors, to identify agents that modulate their
interactions, in addition to its use as a nutritional supplement. Protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:63 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1080 of SEQ ID NO:63, b is an
integer of 15 to 1094, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:63, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 54 

This gene is expressed primarily in umbilical vein endothelial cells induced by
IL-4.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, angiogenesis, inflammatory disorders, and hematopoietic disease.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
angiogenic and hematopoietic systems, expression of this gene at significantly higher
or lower levels is routinely detected in certain tissues or cell types (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy
tissue or bodily fluid from an individual not having the disorder.

The tissue distribution in endothelial cells indicates polynucleotides and 
polypeptides corresponding to this gene are useful in the detection, treatment, and/or 
prevention of vascular conditions, which include, but are not limited to, microvascular 
disease, vascular leak syndrome, aneurysm, stroke, atherosclerosis, arteriosclerosis, or 
embolism. For example, this gene product may represent a soluble factor produced by 
smooth muscle that regulates the innervation of organs or regulates the survival of 
neighboring neurons. Likewise, it is involved in controlling the digestive process, and 
such actions as peristalsis. Similarly, it is involved in controlling the vasculature in 
areas where smooth muscle surrounds the endothelium of blood vessels. Furthermore, 
the protein may also be used to determine biological activity, to raise antibodies, as 
tissue markers, to isolate cognate ligands or receptors, to identify agents that modulate 
their interactions, in addition to its use as a nutritional supplement. Protein, as well as, 
antibodies directed against the protein may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tissues. The secreted protein can also be 
used to determine biological activity, to raise antibodies, as tissue markers, to isolate 
cognate ligands or receptors, to identify agents that modulate their interactions, and as 
nutritional supplements. It may also have a very wide range of biological activities. 
Representative uses are described in the "Chemotaxis" and "Binding Activity" 
sections below, in Examples 11, 12, 13, 14, 15, 16, 18, 19, and 20, and elsewhere 
herein. Briefly, the protein may possess the following activities: cytokine, cell 
proliferation/differentiation modulating activity or induction of other cytokines; 
immunostimulating/immunosuppressant activities (e.g. for treating human 
immunodeficiency virus infection, cancer, autoimmune diseases and allergy); 
regulation of hematopoiesis (e.g. for treating anemia or as adjunct to chemotherapy); 
stimulation or growth of bone, cartilage, tendons, ligaments and/or nerves (e.g. for 
treating wounds); stimulation of follicle stimulating hormone (for control of fertility); 
chemotactic and chemokinetic activities (e.g. for treating infections, tumors); 
hemostatic or thrombolytic activity (e.g. for treating hemophilia, cardiac infarction 
etc.); anti-inflammatory activity (e.g. for treating septic shock, Crohn's Disease); as 
antimicrobials; for treating psoriasis or other hyperproliferative diseases; for 
regulation of metabolism, and behavior. Also contemplated is the use of the 
corresponding nucleic acid in gene therapy procedures. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:64 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1347 of SEQ ID NO:64, b is an 
integer of 15 to 1361, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:64, and where b is greater than or equal to a + 14. 
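
The a-b formula used in the exclusion paragraphs can be illustrated with a short Python sketch. It checks whether a pair (a, b) satisfies the stated constraints and counts how many such pairs exist for SEQ ID NO:64. It assumes that "between 1 to 1347" and "15 to 1361" are inclusive bounds and that positions are 1-based; the function names are hypothetical and are not part of the specification.

```python
MIN_OFFSET = 14  # the text requires b >= a + 14, i.e. a span of at least 15 nucleotides
MIN_B = 15       # lowest value of b named in the text


def in_excluded_family(a: int, b: int, a_max: int, b_max: int) -> bool:
    """Return True if (a, b) satisfies the a-b formula for the given bounds."""
    return 1 <= a <= a_max and MIN_B <= b <= b_max and b >= a + MIN_OFFSET


def count_ab_pairs(a_max: int, b_max: int) -> int:
    """Count every (a, b) pair allowed by the formula (inclusive bounds assumed)."""
    total = 0
    for a in range(1, a_max + 1):
        lowest_b = max(MIN_B, a + MIN_OFFSET)
        if lowest_b <= b_max:
            total += b_max - lowest_b + 1
    return total


if __name__ == "__main__":
    # Bounds stated for SEQ ID NO:64: a in 1..1347, b in 15..1361.
    print(in_excluded_family(1, 15, 1347, 1361))
    print(count_ab_pairs(1347, 1361))
```

Under these assumptions, the condition that b is greater than or equal to a + 14 means that every fragment described by the formula spans at least 15 contiguous nucleotides.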

FEATURES OF PROTEIN ENCODED BY GENE NO: 55 

This gene is expressed primarily in both normal and cancerous pancreas. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, diabetes, gastrointestinal disorders, and pancreatic cancer. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the digestive and 
blood systems, expression of this gene at significantly higher or lower levels is 
routinely detected in certain tissues or cell types (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative 
to the standard gene expression level, i.e., the expression level in healthy tissue or 
bodily fluid from an individual not having the disorder. 

The tissue distribution in pancreas indicates that the protein products of this 
gene are useful as a therapeutic and/or diagnostic agent for pancreatic disorders and 
disorders of the endocrine and exocrine system, including but not limited to diabetes, 
blood disorders, pancreatic cancer, gastrointestinal diseases, hormonal imbalance, 
autoimmune disorders, cystic fibrosis, pancreatitis, and gallstones. Furthermore, the 
protein may also be used to determine biological activity, to raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. Protein, as well as, 
antibodies directed against the protein may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:65 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 933 of SEQ ID NO:65, b is an 
integer of 15 to 947, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:65, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 56 

The translation product of this gene shares sequence homology with 
oxidoreductase. Preferred polypeptides of the invention comprise the following amino 
acid sequence: MSPLSAARAALRVYAVGAAVILAQLLRRCRGGFLEPVXPPRP 
DRVAIVTGGTDGIGYSTANIWRDLGMHVIIAGNNDSKAKQVVSKIKEETLND 
KVEFLYCDLASMTSIRQFVQKFKMKKIPLHVLINNAGVMMVPQRKTRDGFEE 
HFGLNYLGHFLLTNLLLDTLKESGSPGHSARVVTVSSATHYVAELNMDDLQS 
SACYSPHAAYAQSKLALVLFTYHLQRLLAAEGSHVTANVVDPGVVNTDXYK 
HVFWATRLAKKLLGWLLFKTPDEGAWTSIYAAVTPELEGVGGRYLYNEKET 
KSLHVTYNQKLQQQLWSKSCEMTGVLDVTL (SEQ ID NO: 320). The mature 
form of this protein begins at residue 32. Thus, polypeptides comprising residues 
2-330 and 32-330 of the sequence shown above are also provided. Polynucleotides 
encoding such polypeptides are also provided. 
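
The residue ranges named above (2-330 and 32-330) use 1-based, inclusive numbering. The following Python sketch shows how such ranges map onto a sequence string. It uses a placeholder sequence of 330 residues, not the actual sequence of SEQ ID NO: 320, and the helper name is hypothetical.

```python
def residues(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a one-letter sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("residue range lies outside the sequence")
    return sequence[start - 1:end]


# Placeholder only: an initiating Met followed by 329 arbitrary residues.
placeholder = "M" + "A" * 329

without_initial_met = residues(placeholder, 2, 330)  # residues 2-330
mature_form = residues(placeholder, 32, 330)         # residues 32-330

if __name__ == "__main__":
    print(len(without_initial_met), len(mature_form))
```

Because the mature form is stated to begin at residue 32, the range 32-330 omits the first 31 residues of the full-length translation product.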

A preferred polypeptide fragment of the invention comprises the following 
amino acid sequence: MSPLSAARAALRVYAVGAAVILAQLLRRCRGGFLEP 
VXPPRPDRVAIVTGGTDGIGYSTANIWRDLACMLS (SEQ ID NO: 321). 
Polynucleotides encoding these polypeptides are also provided. 

This gene is expressed primarily in breast cancer cells, osteoclastoma, Wilms' 
tumor, thymus stromal cells, and T cell helper I. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, cancer, e.g., breast cancer, osteoclastoma, and Wilms' tumor. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the immune system, 
expression of this gene at significantly higher or lower levels is routinely detected in 
certain tissues or cell types (e.g., reproductive, kidney, immune, hematopoietic, and 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, breast milk, 
urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution in breast cancer tissue, combined with the homology to 
oxidoreductase indicates that polynucleotides and polypeptides corresponding to this 
gene are useful for diagnosis and treatment of cancer, particularly, breast cancer, 
osteoclastoma, and Wilms' tumor. This protein may play a role in the regulation of 
cellular division, and may show utility in the diagnosis, treatment, and/or prevention 
of developmental diseases and disorders, including cancer, and other proliferative 
conditions. Representative uses are described in the "Hyperproliferative Disorders" 
and "Regeneration" sections below and elsewhere herein. Briefly, developmental 
tissues rely on decisions involving cell differentiation and/or apoptosis in pattern 
formation. 

Dysregulation of apoptosis can result in inappropriate suppression of cell 
death, as occurs in the development of some cancers, or in failure to control the extent 
of cell death, as is believed to occur in acquired immunodeficiency and certain 
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of 
potential roles in proliferation and differentiation, this gene product may have 
applications in the adult for tissue regeneration and the treatment of cancers. It may 
also act as a morphogen to control cell and tissue type specification. Therefore, the 
polynucleotides and polypeptides of the present invention are useful in treating, 
detecting, and/or preventing said disorders and conditions, in addition to other types 
of degenerative conditions. Thus this protein may modulate apoptosis or tissue 
differentiation and is useful in the detection, treatment, and/or prevention of 
degenerative or proliferative conditions and diseases. The protein is useful in 
modulating the immune response to aberrant polypeptides, as may exist in 
proliferating and cancerous cells and tissues. The protein can also be used to gain new 
insight into the regulation of cellular growth and proliferation. Furthermore, the 
protein may also be used to determine biological activity, to raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. Protein, as well as, 
antibodies directed against the protein may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:66 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1362 of SEQ ID NO:66, b is an 
integer of 15 to 1376, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:66, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 57 
This gene is expressed primarily in monocytes, T cell helper II and B cell 
lymphoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, immune and hematopoietic diseases and/or disorders, particularly 
B-cell lymphoma. Similarly, polypeptides and antibodies directed to these polypeptides 
are useful in providing immunological probes for differential identification of the 
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the immune system, expression of this gene at significantly higher or 
lower levels is routinely detected in certain tissues or cell types (e.g., immune, 
hematopoietic, and cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken 
from an individual having such a disorder, relative to the standard gene expression 
level, i.e., the expression level in healthy tissue or bodily fluid from an individual not 
having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 180 as residues: Asp-30 to Val-40. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in monocytes, T cell helper, and B cell lymphoma cells 
indicates that polynucleotides and polypeptides corresponding to this gene are useful 
for diagnosis and treatment of B cell lymphoma. Representative uses are described in 
the "Immune Activity" and "infectious disease" sections below, in Examples 11, 13, 
14, 16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the expression of this gene 
product indicates a role in regulating the proliferation; survival; differentiation; and/or 
activation of hematopoietic cell lineages, including blood stem cells. This gene 
product is involved in the regulation of cytokine production, antigen presentation, or 
other processes suggesting a usefulness in the treatment of cancer (e.g. by boosting 
immune responses). 

Since the gene is expressed in cells of lymphoid origin, the natural gene 
product is involved in immune functions. Therefore it is also useful as an agent for 
immunological disorders including arthritis, asthma, immunodeficiency diseases such 
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory 
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities, 
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and 
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity 
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic 
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's 
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that 
influences the differentiation or behavior of other blood cells, or that recruits 
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in 
the expansion of stem cells and committed progenitors of various blood lineages, and 
in the differentiation and/or proliferation of various cell types. Furthermore, the 
protein may also be used to determine biological activity, to raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. Protein, as well as, 
antibodies directed against the protein may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:67 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 2420 of SEQ ID NO:67, b is an 
integer of 15 to 2434, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:67, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 58 

This gene is expressed primarily in human lung cancer. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, pulmonary diseases and/or disorders, particularly cancers of the lung. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
immune system, expression of this gene at significantly higher or lower levels is 
routinely detected in certain tissues or cell types (e.g., pulmonary, and cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, pulmonary lavage, 
pulmonary surfactant, synovial fluid and spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 181 as residues: Phe-39 to Asp-45. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in lung cancer tissue indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for diagnosis and treatment of 
immune system disorders such as ARDS, cystic fibrosis, and cancer, particularly lung 
cancer. This protein may play a role in the regulation of cellular division, and may 
show utility in the diagnosis, treatment, and/or prevention of developmental diseases 
and disorders, including cancer, and other proliferative conditions. Representative 
uses are described in the "Hyperproliferative Disorders" and "Regeneration" sections 
below and elsewhere herein. Briefly, developmental tissues rely on decisions 
involving cell differentiation and/or apoptosis in pattern formation. 

Dysregulation of apoptosis can result in inappropriate suppression of cell 
death, as occurs in the development of some cancers, or in failure to control the extent 
of cell death, as is believed to occur in acquired immunodeficiency and certain 
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of 
potential roles in proliferation and differentiation, this gene product may have 
applications in the adult for tissue regeneration and the treatment of cancers. It may 
also act as a morphogen to control cell and tissue type specification. Therefore, the 
polynucleotides and polypeptides of the present invention are useful in treating, 
detecting, and/or preventing said disorders and conditions, in addition to other types 
of degenerative conditions. Thus this protein may modulate apoptosis or tissue 
differentiation and is useful in the detection, treatment, and/or prevention of 
degenerative or proliferative conditions and diseases. The protein is useful in 
modulating the immune response to aberrant polypeptides, as may exist in 
proliferating and cancerous cells and tissues. The protein can also be used to gain new 
insight into the regulation of cellular growth and proliferation. Furthermore, the 
protein may also be used to determine biological activity, to raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. Protein, as well as, 
antibodies directed against the protein may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:68 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1072 of SEQ ID NO:68, b is an 
integer of 15 to 1086, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:68, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 59 

This gene is expressed primarily in larynx carcinoma and early stage human 
lung. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, developmental, gastrointestinal, and pulmonary diseases and/or 
disorders, particularly larynx carcinoma. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the immune system, expression of this gene 
at significantly higher or lower levels is routinely detected in certain tissues or cell 
types (e.g., developmental, gastrointestinal, pulmonary, and cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, amniotic fluid, pulmonary lavage, 
sputum, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken 
from an individual having such a disorder, relative to the standard gene expression 
level, i.e., the expression level in healthy tissue or bodily fluid from an individual not 
having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 182 as residues: His-42 to Lys-49. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in larynx carcinoma and early stage human lung 
indicates that polynucleotides and polypeptides corresponding to this gene are useful 
for treating immune system disorders such as cancer, particularly larynx carcinoma. 
This protein may play a role in the regulation of cellular division, and may show 
utility in the diagnosis, treatment, and/or prevention of developmental diseases and 
disorders, including cancer, and other proliferative conditions. Representative uses are 
described in the "Hyperproliferative Disorders" and "Regeneration" sections below 
and elsewhere herein. Briefly, developmental tissues rely on decisions involving cell 
differentiation and/or apoptosis in pattern formation. 

Dysregulation of apoptosis can result in inappropriate suppression of cell 
death, as occurs in the development of some cancers, or in failure to control the extent 
of cell death, as is believed to occur in acquired immunodeficiency and certain 
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of 
potential roles in proliferation and differentiation, this gene product may have 
applications in the adult for tissue regeneration and the treatment of cancers. It may 
also act as a morphogen to control cell and tissue type specification. Therefore, the 
polynucleotides and polypeptides of the present invention are useful in treating, 
detecting, and/or preventing said disorders and conditions, in addition to other types 
of degenerative conditions. Thus this protein may modulate apoptosis or tissue 
differentiation and is useful in the detection, treatment, and/or prevention of 
degenerative or proliferative conditions and diseases. The protein is useful in 
modulating the immune response to aberrant polypeptides, as may exist in 
proliferating and cancerous cells and tissues. The protein can also be used to gain new 
insight into the regulation of cellular growth and proliferation. Furthermore, the 
protein may also be used to determine biological activity, to raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. Protein, as well as, 
antibodies directed against the protein may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:69 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1248 of SEQ ID NO:69, b is an 
integer of 15 to 1262, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:69, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 60 

Preferred polypeptides of the invention comprise the following amino acid 
sequence: MEVTTEDTSRTDVSEPATSGGAADGVTSIAPTAVASSTTAASITTA 
ASSMTVASSAPTTAASSTTVASIAPTTTASSMTAASSTPMTLALPAPTSTXTGR 
TPSTTATGHPSLSTALAQVPKSSALPRTATLATLATRAQTVATTANTSSPMST 
RPSPSKHMPSDTAASPVPPMXPQAQGPISQVSVDQPVVNTTXKSTXMPSNTT 
XEPLTQAVVDKTLLLVVLLLGVTLFITVLVLFALQAYESYKKKDYTQVDYLI 
NGMYADSEM (SEQ ID NO: 322). Polynucleotides encoding these polypeptides are 
also provided. 

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: ARCPELPGLRCRPRPRAGPQAPSYCPRATRPPG 
ACCARMRLLLEWRVYLRLTCATKDGMARECPTTWLSPPAKPDFAQRHSVK 
PTALQGGRWSRLGASP (SEQ ID NO: 323). Polynucleotides encoding these 
polypeptides are also provided. 

This gene is expressed primarily in adipocytes, osteoblasts, cerebellum, 
hypothalamus and Hodgkin's lymphoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, metabolic, skeletal, neural, and immune diseases and/or disorders, 
particularly Hodgkin's lymphoma. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the immune system, expression of this gene at 
significantly higher or lower levels is routinely detected in certain tissues or cell types 
(e.g., metabolic, skeletal, neural, immune, and cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 183 as residues: Pro-33 to Gln-40, Gly-51 to Arg-56. 
Polynucleotides encoding said polypeptides are also provided. 

The tissue distribution in Hodgkin's lymphoma cells indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis 
and treatment of immune system disorders such as cancer, particularly Hodgkin's 
lymphoma. The secreted protein can also be used to determine biological activity, to 
raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify 
agents that modulate their interactions, and as nutritional supplements. It may also 
have a very wide range of biological activities. Representative uses are described in 
the "Chemotaxis" and "Binding Activity" sections below, in Examples 11, 12, 13, 14, 
15, 16, 18, 19, and 20, and elsewhere herein. Briefly, the protein may possess the 
following activities: cytokine, cell proliferation/differentiation modulating activity or 
induction of other cytokines; immunostimulating/immunosuppressant activities (e.g. 
for treating human immunodeficiency virus infection, cancer, autoimmune diseases 
and allergy); regulation of hematopoiesis (e.g. for treating anemia or as adjunct to 
chemotherapy); stimulation or growth of bone, cartilage, tendons, ligaments and/or 
nerves (e.g. for treating wounds); stimulation of follicle stimulating hormone (for 
control of fertility); chemotactic and chemokinetic activities (e.g. for treating 
infections, tumors); hemostatic or thrombolytic activity (e.g. for treating hemophilia, 
cardiac infarction etc.); anti-inflammatory activity (e.g. for treating septic shock, 
Crohn's Disease); as antimicrobials; for treating psoriasis or other hyperproliferative 
diseases; for regulation of metabolism, and behavior. Also contemplated is the use of 
the corresponding nucleic acid in gene therapy procedures. Protein, as well as, 
antibodies directed against the protein may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:70 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1628 of SEQ ID NO:70, b is an 
integer of 15 to 1642, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:70, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 61 

The translation product of this gene shares sequence homology with 
polypeptides in the cystatin family. Cystatin polypeptides are cysteine protease 
inhibitors. For an analysis of the composition of several members of the cystatin 
family see Gene (1987) 61(3):329-338, incorporated herein by reference. The cystatin 
activity of polypeptides encoded by this gene is measured by several assays known in 
the art, including assays described in co-owned, copending US Patent Application 
Serial No. 08/744,138, incorporated herein by reference. Preferred polypeptides of the 
invention comprise the following amino acid sequence: LPATVEFAVHTFNQQSKD 
TYAYRLGHILNSWKEQVESKTVFSMELLLGRTRCGKFEDDIDNCHFQESTEL 
NNTFTCFFTISTRPWMTQFSLLNKTC (SEQ ID NO: 324). Fragments of such 
polypeptides having cystatin activity (cysteine protease inhibitory activity) are 
particularly preferred. Polynucleotides encoding such polypeptides are also provided. 
In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: LLWARGLGRAKSAVPTVST MLGLPWKGGLS 
WALLLLLLGSQILLIYAWHFHEQRDCDEHNVMARYLPATVEFAVHTFNQQS 
KDYYAYRLGHILNSWKEQVESKTVFSMELLLGRTRCGKFEDDIDNCHFQE 
STELNxNTFTCFFTISTRPWMTQFSLLNK TCLEGFH (SEQ ID NO: 325). 
Polynucleotides encoding these polypeptides are also provided. 

This gene is expressed primarily in testes and epididymis. For a review of a 
cystatin showing testes-specific expression see Mol. Endocrinol. (1992 Oct.) 
6(10):1653-1664, incorporated herein by reference. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the immune system, 
expression of this gene at significantly higher or lower levels is routinely detected in 
certain tissues or cell types (e.g., reproductive, testicular, and cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, seminal fluid, synovial fluid and 
spinal fluid) or another tissue or cell sample taken from an individual having such a 
disorder, relative to the standard gene expression level, i.e., the expression level in 
healthy tissue or bodily fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 184 as residues: Phe-31 to Asp-38, Asn-59 to Tyr-65, 
Ser-76 to Glu-82, Thr-96 to Cys-108, Gln-111 to Asn-118. Polynucleotides encoding 
said polypeptides are also provided. 
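
By way of non-limiting illustration, residue ranges written in the form "Phe-31 to Asp-38" can be read as 1-based, inclusive positions in the referenced polypeptide. The Python sketch below is not part of the original disclosure. The function names `parse_epitopes` and `extract_epitope` and the toy sequence are hypothetical; the actual sequence of the referenced SEQ ID NO is not reproduced in this section.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges such as "Phe-31 to Asp-38".
EPITOPE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def parse_epitopes(text):
    """Return (start_residue, start_pos, end_residue, end_pos) tuples."""
    return [
        (m.group(1), int(m.group(2)), m.group(3), int(m.group(4)))
        for m in EPITOPE_RE.finditer(text)
    ]


def extract_epitope(sequence, start_res, start, end_res, end):
    """Slice a 1-based inclusive range and check the residues at both ends."""
    if start < 1 or end < start or end > len(sequence):
        raise ValueError("range falls outside the sequence")
    peptide = sequence[start - 1:end]
    if peptide[0] != THREE_TO_ONE[start_res] or peptide[-1] != THREE_TO_ONE[end_res]:
        raise ValueError("end residues do not match the stated range")
    return peptide


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the indexing convention.
    toy_sequence = "MKFAAAD"
    for epitope in parse_epitopes("Phe-3 to Asp-7"):
        print(extract_epitope(toy_sequence, *epitope))
```

Checking the named residues at both ends of each range is a simple guard against off-by-one errors or a mismatched sequence listing.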

The tissue distribution in testes and epididymis, combined with the 
homology to cystatins, indicates that polynucleotides and polypeptides corresponding 
to this gene are useful for the treatment and diagnosis of conditions concerning proper 
testicular function (e.g. endocrine function, sperm maturation), as well as cancer. 
Therefore, this gene product is useful in the treatment of male infertility and/or 
impotence. This gene product is also useful in assays designed to identify binding 
agents, as such agents (antagonists) are useful as male contraceptive agents. Similarly, 
the protein is believed to be useful in the treatment and/or diagnosis of testicular 
cancer. The testes are also a site of active gene expression of transcripts that are 
expressed, particularly at low levels, in other tissues of the body. Therefore, this gene 
product is expressed in other specific tissues or organs where it may play related 
functional roles in other processes, such as hematopoiesis, inflammation, bone 
formation, and kidney function, to name a few possible target indications. 
Representative uses are described in the "Hyperproliferative Disorders" and 
"Regeneration" sections below and elsewhere herein. Briefly, developmental tissues 
rely on decisions involving cell differentiation and/or apoptosis in pattern formation. 
Dysregulation of apoptosis can result in inappropriate suppression of cell 
death, as occurs in the development of some cancers, or in failure to control the extent 
of cell death, as is believed to occur in acquired immunodeficiency and certain 
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of 
potential roles in proliferation and differentiation, this gene product may have 
applications in the adult for tissue regeneration and the treatment of cancers. It may 
also act as a morphogen to control cell and tissue type specification. Therefore, the 
polynucleotides and polypeptides of the present invention are useful in treating, 
detecting, and/or preventing said disorders and conditions, in addition to other types 
of degenerative conditions. Thus this protein may modulate apoptosis or tissue 
differentiation and is useful in the detection, treatment, and/or prevention of 
degenerative or proliferative conditions and diseases. The protein is useful in 
modulating the immune response to aberrant polypeptides, as may exist in 
proliferating and cancerous cells and tissues. The protein can also be used to gain new 
insight into the regulation of cellular growth and proliferation. Furthermore, the 
protein may also be used to determine biological activity, to raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. Protein, as well as, 
antibodies directed against the protein may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tissues. 

Cysteine proteinase inhibitors of the cystatin superfamily are ubiquitous in the 
body and are generally tight-binding inhibitors of papain-like cysteine proteinases, 
such as cathepsins B, H, L, S, and K. They should therefore serve a protective 
function to regulate the activities of such endogenous proteinases, which otherwise 
may cause uncontrolled proteolysis and tissue damage. Cysteine proteinase activity 
can normally not be measured in body fluids, but can be detected extracellularly in 
conditions like endotoxin-induced sepsis, metastasizing cancer, and at local 
inflammatory processes in rheumatoid arthritis, purulent bronchiectasis and 
periodontitis, which indicates that a tight cystatin regulation is a necessity in the 
normal state. A deficiency state in which the levels of the intracellular cystatin, 
cystatin B, are lowered due to mutations has recently been shown to segregate with a 
form of progressive myoclonus epilepsy, which points to additional specialized 
functions of cystatins. Moreover, results showing that chicken cystatin inhibits polio 
virus replication, human cystatin C inhibits corona- and herpes simplex virus 
replication, and human cystatin A inhibits rhabdovirus-induced apoptosis in cell 
cultures indicate that cystatins play additional roles in the human defense system. 
The cystatins constitute a superfamily of evolutionarily related proteins, all composed 
of at least one 100-120 residue domain with conserved sequence motifs. 

The previously well characterized single-domain human members of this 
superfamily could be grouped in two protein families. The Family 1 members, 
cystatins (or stefins) A and B, contain approximately 100 amino acid residues, lack 
disulfide bridges, and are not synthesized as preproteins with signal peptides. The 
Family 2 cystatins (cystatins C, D, S, SN, and SA) are secreted proteins of approx. 
120 amino acid residues (Mr 13,000-14,000) and have two characteristic intrachain 
disulfide bonds. Recently, we identified an additional human cystatin superfamily 
member by EST sequencing in epithelial cell derived cDNA libraries which we 
named cystatin E. The same cystatin was independently discovered by differential 
display experiments as an mRNA species down-regulated in breast tumor tissue, but 
present in the surrounding epithelium and reported under the name cystatin M. 
Cystatin E/M is an atypical, secreted low-Mr cystatin in that it is a glycoprotein and 
just shows 30-35% sequence identity in alignments with the human Family 2 
cystatins, which shows that additional cystatin families are yet to be identified. The 
cystatin E/M gene has been localized to chromosome 2, whereas all human Family 2 
cystatin genes are clustered on the short arm of chromosome 20, which further 
stresses that cystatin E/M is just distantly related to the other secreted human low-Mr 
cystatins. It is believed therefore, that polypeptides encoded by this gene are useful in 
diagnosing and treating disease consistent with the aforementioned conditions in 
which cystatins are implicated. 
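
By way of non-limiting illustration, a figure such as "30-35% sequence identity in alignments" can be computed from a pairwise alignment once a convention is fixed. The Python sketch below is not part of the original disclosure. It assumes one common convention, namely identical residues divided by the number of aligned columns in which neither sequence has a gap; other conventions use a different denominator and give different values. The function name `percent_identity` and the short aligned strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must already be aligned, i.e. of equal length with gap
    characters inserted. This is an assumed convention, not the only one.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments: 4 identical residues in 5 ungapped columns.
    print(percent_identity("ACDE-G", "ACDFHG"))
```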

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:71 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 907 of SEQ ID NO:71, b is an 
integer of 15 to 921, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:71, and where b is greater than or equal to a + 14. 
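
By way of non-limiting illustration, the general formula a-b can be read as a membership test on 1-based, inclusive nucleotide positions, in which every qualifying fragment spans at least 15 residues. The Python sketch below is not part of the original disclosure. The function names and the `min_len` parameter are assumptions introduced for illustration; the length 921 is taken from the range stated for SEQ ID NO:71.

```python
def is_excluded_fragment(a: int, b: int, seq_len: int, min_len: int = 15) -> bool:
    """Return True if positions a..b (1-based, inclusive) satisfy the a-b formula.

    For seq_len = 921 and min_len = 15 this encodes:
    1 <= a <= 907, 15 <= b <= 921, and b >= a + 14.
    """
    a_max = seq_len - (min_len - 1)
    return (
        1 <= a <= a_max
        and min_len <= b <= seq_len
        and b >= a + (min_len - 1)
    )


def count_excluded_fragments(seq_len: int, min_len: int = 15) -> int:
    """Count the (a, b) pairs allowed by the formula by direct enumeration."""
    return sum(
        1
        for a in range(1, seq_len + 1)
        for b in range(a, seq_len + 1)
        if is_excluded_fragment(a, b, seq_len, min_len)
    )


if __name__ == "__main__":
    seq_len = 921  # upper bound of b stated for SEQ ID NO:71
    print(is_excluded_fragment(1, 15, seq_len))     # shortest fragment at the 5' end
    print(is_excluded_fragment(908, 921, seq_len))  # only 14 residues long
    print(count_excluded_fragments(seq_len))
```

Under this reading, the upper bound on a is always the sequence length minus 14, and the number of qualifying pairs equals a_max(a_max + 1)/2, which gives 411,778 for a 921-nucleotide sequence.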

FEATURES OF PROTEIN ENCODED BY GENE NO: 62 

The translation product of this gene shares sequence homology with 
Neutrophil Gelatinase-Associated Lipocalin, which is thought to be important in 
immune regulation (See Genbank and Geneseq Accession Nos. emb|CAA58127.1 
and US5627034, respectively; all references and information available through these 
accessions are hereby incorporated herein by reference; for example, Biochem. 
Biophys. Res. Commun. 202 (3), 1468-1475 (1994), and FEBS Lett. 314 (3), 386-388 
(1992)). 

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: LEQKLELHRGGGRSRTSGSPGLQEFGTREERGE 
GEQRTGREFSGNGGRAVEAARMRLLCGLWLWLSLLKVLQAQTPTPLPLPP 
PMQSFQGNQFQGEWFVLGLAGNSFRPEHRALLNAFTATFELSDDGRFEVWN 
AMTRGQHCDTWSYVLIPAAQPGQFTVDHGVGRSWLLPPGTLDQFICLGRAQ 
GLSDDNIVFPDVTGXALDL XSLPWVAAPA (SEQ ID NO: 326). Polynucleotides 
encoding these polypeptides are also provided. 

This gene is expressed primarily in epididymis and osteoclastoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, reproductive and skeletal diseases and/or disorders, particularly cancers 
such as osteoclastoma and testicular cancer. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the immune system, expression of this gene 
at significantly higher or lower levels is routinely detected in certain tissues or cell 
types (e.g., reproductive, testicular, skeletal, and cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, seminal fluid, urine, synovial fluid and spinal fluid) 
or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy 
tissue or bodily fluid from an individual not having the disorder. 
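
By way of non-limiting illustration, a comparison of a measured expression level against the standard level can be expressed as a fold change. The Python sketch below is not part of the original disclosure. The function name `classify_expression` and the two-fold default threshold are assumptions chosen only for illustration; the specification does not state a numerical cutoff for "significantly higher or lower", and a real assay would normally also account for replicate variability.

```python
import math


def classify_expression(sample_level: float,
                        standard_level: float,
                        fold_threshold: float = 2.0) -> str:
    """Classify a sample level relative to the standard (healthy) level.

    fold_threshold is a hypothetical cutoff: a sample at least that many
    times above or below the standard is reported as higher or lower.
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")
    log2_ratio = math.log2(sample_level / standard_level)
    cutoff = math.log2(fold_threshold)
    if log2_ratio >= cutoff:
        return "higher"
    if log2_ratio <= -cutoff:
        return "lower"
    return "comparable"


if __name__ == "__main__":
    # Hypothetical measurements in arbitrary units.
    print(classify_expression(8.0, 2.0))
    print(classify_expression(1.0, 4.0))
    print(classify_expression(3.0, 2.0))
```

Working on a log2 scale treats a two-fold increase and a two-fold decrease symmetrically.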

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 185 as residues: Met-82 to Thr-90. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in epididymis and homology to neutrophil 
gelatinase-associated lipocalin indicate that polynucleotides and polypeptides corresponding to 
this gene are useful for diagnosis and treatment of skin diseases and immune system 
disorders such as cancer, particularly osteoclastoma. The secreted protein can also be 
used to determine biological activity, to raise antibodies, as tissue markers, to isolate 
cognate ligands or receptors, to identify agents that modulate their interactions, and as 
nutritional supplements. It may also have a very wide range of biological activities. 
Representative uses are described in the "Chemotaxis" and "Binding Activity" 
sections below, in Examples 11, 12, 13, 14, 15, 16, 18, 19, and 20, and elsewhere 
herein. Briefly, the protein may possess the following activities: cytokine, cell 
proliferation/differentiation modulating activity or induction of other cytokines; 
immunostimulating/immunosuppressant activities (e.g. for treating human 
immunodeficiency virus infection, cancer, autoimmune diseases and allergy); 
regulation of hematopoiesis (e.g. for treating anemia or as adjunct to chemotherapy); 
stimulation or growth of bone, cartilage, tendons, ligaments and/or nerves (e.g. for 
treating wounds); stimulation of follicle stimulating hormone (for control of fertility); 
chemotactic and chemokinetic activities (e.g. for treating infections, tumors); 
hemostatic or thrombolytic activity (e.g. for treating hemophilia, cardiac infarction 
etc.); anti-inflammatory activity (e.g. for treating septic shock, Crohn's Disease); as 
antimicrobials; for treating psoriasis or other hyperproliferative diseases; for 
regulation of metabolism, and behavior. Also contemplated is the use of the 
corresponding nucleic acid in gene therapy procedures. Protein, as well as antibodies 
directed against the protein, may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:72 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 892 of SEQ ID NO:72, b is an 
integer of 15 to 906, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:72, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 63 

The translation product of this gene was shown to have homology to colipase, 
which plays an essential role in intestinal fat digestion by anchoring lipase on 
lipid/water interfaces in the presence of bile salts (See Genbank Accession No. 
gb|AAA03513.1; all references and information available through this accession are 
hereby incorporated by reference herein). 

This gene is expressed primarily in epididymis. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, reproductive diseases and/or disorders, particularly epididymis-related 
diseases. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the 
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the immune system, expression of this gene at significantly higher or 
lower levels is routinely detected in certain tissues or cell types (e.g., reproductive, 
metabolic, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
seminal fluid, bile, chyme, urine, synovial fluid and spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from 
an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 186 as residues: Ile-40 to Cys-49, Arg-52 to Cys-57, 
Ser-94 to Trp-99, Gly-105 to Gly-111. Polynucleotides encoding said polypeptides 
are also provided. 

The tissue distribution in epididymis indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for diagnosis and treatment of 
immune system diseases and disorders of the epididymis. Polynucleotides and 
polypeptides corresponding to this gene are useful for the treatment and diagnosis of 
conditions concerning proper testicular function (e.g. endocrine function, sperm 
maturation), as well as cancer. Therefore, this gene product is useful in the treatment 
of male infertility and/or impotence. This gene product is also useful in assays 
designed to identify binding agents, as such agents (antagonists) are useful as male 
contraceptive agents. Similarly, the protein is believed to be useful in the treatment 
and/or diagnosis of testicular cancer. The testes are also a site of active gene 
expression of transcripts that are expressed, particularly at low levels, in other tissues 
of the body. Therefore, this gene product is expressed in other specific tissues or 
organs where it may play related functional roles in other processes, such as 
hematopoiesis, inflammation, bone formation, and kidney function, to name a few 
possible target indications. Furthermore, the protein may also be used to determine 
biological activity, to raise antibodies, as tissue markers, to isolate cognate ligands or 
receptors, to identify agents that modulate their interactions, in addition to its use as a 
nutritional supplement. Protein, as well as antibodies directed against the protein, may 
show utility as a tumor marker and/or immunotherapy targets for the above listed 
tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:73 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 666 of SEQ ID NO:73, b is an 
integer of 15 to 680, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:73, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 64 

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: MCVCERKRGREKEGGVTPTMTSNFPFCTLILGI 
AQAQACPGCPGDWPGLGSGVGEGLHHIRTCRTPIPCSPPAPAAACLGSGH 
ARLPCVLRLWPVPANLSSPFRLEALHCSFWSSPLLPAPHLAFFGFRDLLTDFL 
LAACLLTFQKTPLELPMAVVHLLVATPCYQMLDNLPLPSAAAN WC (SEQ ID 
NO: 327). Polynucleotides encoding these polypeptides are also provided. 

This gene is expressed primarily in melanocytes and placenta and to a lesser 
extent in bone marrow and many cells of the immune system, including B-cells, 
dendritic cells, and T-cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, skin cancer and disorders of the reproductive and immune systems. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
reproductive and immune systems, expression of this gene at significantly higher or 
lower levels is routinely detected in certain tissues and cell types (e.g., reproductive 
tissue, hematopoietic tissue, melanocytes and cells and tissue of the immune system, 
and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, amniotic 
fluid, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken 
from an individual having such a disorder, relative to the standard gene expression 
level, i.e., the expression level in healthy tissue or bodily fluid from an individual not 
having the disorder. 

The tissue distribution in melanocytes indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for the diagnosis and treatment of 
disorders affecting the skin, the reproductive system, and the immune system, 
particularly cancers. Representative uses are described in the "Biological Activity", 
"Hyperproliferative Disorders", "Infectious Disease", and "Regeneration" sections 
below, in Examples 11, 19, and 20, and elsewhere herein. Briefly, the protein is useful 
in detecting, treating, and/or preventing congenital disorders (i.e. nevi, moles, 
freckles, Mongolian spots, hemangiomas, port-wine syndrome), integumentary 
tumors (i.e. keratoses, Bowen's Disease, basal cell carcinoma, squamous cell 
carcinoma, malignant melanoma, Paget's Disease, mycosis fungoides, and Kaposi's 
sarcoma), injuries and inflammation of the skin (i.e. wounds, rashes, prickly heat 
disorder, psoriasis, dermatitis), atherosclerosis, urticaria, eczema, photosensitivity, 
autoimmune disorders (i.e. lupus erythematosus, vitiligo, dermatomyositis, morphea, 
scleroderma, pemphigoid, and pemphigus), keloids, striae, erythema, petechiae, 
purpura, and xanthelasma. In addition, such disorders may predispose to increased 
susceptibility to viral and bacterial infections of the skin (i.e. cold sores, warts, 
chickenpox, molluscum contagiosum, herpes zoster, boils, cellulitis, erysipelas, 
impetigo, tinea, athlete's foot, and ringworm). Moreover, the protein product of this 
gene may also be useful for the treatment or diagnosis of various connective tissue 
disorders (i.e., arthritis, trauma, tendonitis, chondromalacia and inflammation, etc.), 
autoimmune disorders (i.e., rheumatoid arthritis, lupus, scleroderma, 
dermatomyositis, etc.), dwarfism, spinal deformation, joint abnormalities, and 
chondrodysplasias (i.e. spondyloepiphyseal dysplasia congenita, familial 
osteoarthritis, Atelosteogenesis type II, metaphyseal chondrodysplasia type Schmid). 
Furthermore, the protein may also be used to determine biological activity, to raise 
antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents 
that modulate their interactions, in addition to its use as a nutritional supplement. 
Protein, as well as antibodies directed against the protein, may show utility as a tumor 
marker and/or immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:74 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1619 of SEQ ID NO:74, b is an 
integer of 15 to 1633, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:74, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 65 

Preferred polypeptides of the invention comprise the following amino acid 
sequence: YLWGRPRLRMRAGTSPSAPWGEKREKLGHKLPVALQGYHPWIL 
LECTVFWARVVLACFSLYLIRGPNCINRQPEPTYQKACNLDCSSDFGQER 
APAWELLGPESEQRLREYTAQGLQSLASSHRWRQFKTEGKMRGGASPLPWLI 
CFWLCSYKGSDNSLKPVVPGPTLCPQSLVSPSVHPSTRSASLGRHRAEAA 
(SEQ ID NO: 328). Polynucleotides encoding these polypeptides are also provided. 

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: MPGILAGIPVKDLCLSLLQGFRLLLLCVCPGWL 
SGWMGGQKGSPRIVDIG (SEQ ID NO: 329). Polynucleotides encoding these 
polypeptides are also provided. This gene maps to chromosome 15; accordingly, 
polynucleotides of the invention are used in linkage analysis as a marker for 
chromosome 15. 

This gene is expressed primarily in brain and breast and to a lesser extent in 
the liver, pancreas, and T-cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, disorders affecting the brain and CNS, the reproductive system, or the 
immune system. Similarly, polypeptides and antibodies directed to these polypeptides 
are useful in providing immunological probes for differential identification of the 
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the central nervous system, the reproductive system, and the immune 
system, expression of this gene at significantly higher or lower levels is routinely 
detected in certain tissues or cell types (e.g., brain and other tissue of the nervous 
system, mammary tissue, endocrine tissue, hepatic tissue, reproductive tissue, cells 
and tissue of the immune system, cancerous and wounded tissues) or bodily fluids 
(e.g., serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell 
sample taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 188 as residues: Met-37 to Ser-43. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in brain cells indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for the diagnosis and treatment of 
disorders affecting the central nervous system, the reproductive system, and the 
immune system, including cancers. Representative uses are described in the 
"Regeneration" and "Hyperproliferative Disorders" sections below, in Examples 11, 
15, and 18, and elsewhere herein. Briefly, the uses include, but are not limited to the 
detection, treatment, and/or prevention of Alzheimer's Disease, Parkinson's Disease, 
Huntington's Disease, Tourette Syndrome, meningitis, encephalitis, demyelinating 
diseases, peripheral neuropathies, neoplasia, trauma, congenital malformations, spinal 
cord injuries, ischemia and infarction, aneurysms, hemorrhages, schizophrenia, 
mania, dementia, paranoia, obsessive compulsive disorder, depression, panic disorder, 
learning disabilities, ALS, psychoses, autism, and altered behaviors, including 
disorders in feeding, sleep patterns, balance, and perception. In addition, elevated 
expression of this gene product in regions of the brain indicates it plays a role in 
normal neural function. 

Potentially, this gene product is involved in synapse formation, 
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or 
survival. Furthermore, the protein may also be used to determine biological activity, 
to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to 
identify agents that modulate their interactions, in addition to its use as a nutritional 
supplement. Protein, as well as, antibodies directed against the protein may show 
utility as a tumor marker and/or immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:75 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1008 of SEQ ID NO:75, b is an 
integer of 15 to 1022, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:75, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 66 

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: AKGEERKEAFSLKiMVQLSSEPISFGLMYLYLGV 
FFHLIYPGALSITTLGKHSHPFFTAEQNSTVWMEHTLFHQSPVASHLVCFQSF 
AFSE (SEQ ID NO: 330). Polynucleotides encoding these polypeptides are also 
provided. 

This gene is expressed primarily in the brain and the immune system, in 
particular T-cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, disorders affecting the brain, such as Alzheimer's or disorders affecting 
the immune system, such as AIDS. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the brain and CNS and the immune systems, expression 
of this gene at significantly higher or lower levels is routinely detected in certain 
tissues and cell types (e.g., brain and other tissue of the nervous system, cells and 
tissue of the immune system, and cancerous and wounded tissues) or bodily fluids 
(e.g., serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell 
sample taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution in brain cells and tissues indicates that polynucleotides 
and polypeptides corresponding to this gene are useful for the diagnosis and treatment 
of disorders affecting the brain and CNS or disorders affecting the immune system. 
Representative uses are described in the "Regeneration" and "Hyperproliferative 
Disorders" sections below, in Example 11, 15, and 18, and elsewhere herein. Briefly, 
the uses include, but are not limited to the detection, treatment, and/or prevention of 
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, Tourette Syndrome, 
meningitis, encephalitis, demyelinating diseases, peripheral neuropathies, neoplasia, 
trauma, congenital malformations, spinal cord injuries, ischemia and infarction, 
aneurysms, hemorrhages, schizophrenia, mania, dementia, paranoia, obsessive 
compulsive disorder, depression, panic disorder, learning disabilities, ALS, 
psychoses, autism, and altered behaviors, including disorders in feeding, sleep 
patterns, balance, and perception. In addition, elevated expression of this gene product 
in regions of the brain indicates it plays a role in normal neural function. 

Potentially, this gene product is involved in synapse formation, 
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or 
survival. Furthermore, the protein may also be used to determine biological activity, 
to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to 
identify agents that modulate their interactions, in addition to its use as a nutritional 
supplement. Protein, as well as, antibodies directed against the protein may show 
utility as a tumor marker and/or immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:76 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1170 of SEQ ID NO:76, b is an 
integer of 15 to 1184, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:76, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 67 

The translation product of this gene shares sequence homology with 
penaeidin-2, which is thought to be a member of a new family of antimicrobial 
peptides from the hemolymph of the shrimp Penaeus vannamei. The molecules display 
antimicrobial activity against fungi and bacteria with a predominant activity against 
Gram-positive bacteria. 

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: GPAHPASPPLMTLSLQLAELVHFVCAFQSQWT 
GVYPMMPPLKPTEPLCFA CVPCRV (SEQ ID NO: 331). Polynucleotides 
encoding these polypeptides are also provided. 

This gene is expressed primarily in spleen. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, immune and hematopoietic diseases and/or disorders, particularly 
disorders affecting the spleen, including bacterial and fungal infections. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the hematopoietic 
and immune systems, expression of this gene at significantly higher or lower levels is 
routinely detected in certain tissues and cell types (e.g., immune, hematopoietic, and 
cells and tissue of the immune system, cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from 
an individual not having the disorder. 

The tissue distribution in spleen and homology to the penaeidin family of 
antibiotics indicates that polynucleotides and polypeptides corresponding to this gene 
are useful for the diagnosis and treatment of disorders affecting the spleen, especially 
fungal and bacterial infections. Representative uses are described in the "Immune 
Activity" and "infectious disease" sections below, in Example 11, 13, 14, 16, 18, 19, 
20, and 27, and elsewhere herein. Briefly, the expression of this gene product 
indicates a role in regulating the proliferation; survival; differentiation; and/or 
activation of hematopoietic cell lineages, including blood stem cells. This gene 
product is involved in the regulation of cytokine production, antigen presentation, or 
other processes suggesting a usefulness in the treatment of cancer (e.g. by boosting 
immune responses). 

Since the gene is expressed in cells of lymphoid origin, the natural gene 
product is involved in immune functions. Therefore it is also useful as an agent for 
immunological disorders including arthritis, asthma, immunodeficiency diseases such 
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory 
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities, 
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and 
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity 
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic 
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's 
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that 
influences the differentiation or behavior of other blood cells, or that recruits 
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in 
the expansion of stem cells and committed progenitors of various blood lineages, and 
in the differentiation and/or proliferation of various cell types. Furthermore, the 
protein may also be used to determine biological activity, raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. Protein, as well as, 
antibodies directed against the protein may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:77 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 298 of SEQ ID NO:77, b is an 
integer of 15 to 312, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:77, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 68 

Contact of cells with supernatant expressing the product of this gene has been 
shown to increase the permeability of the plasma membrane of THP-1 cells to 
calcium. Thus it is likely that the product of this gene is involved in a signal 
transduction pathway that is initiated when the product binds a receptor on the surface 
of the plasma membrane of monocytes, in addition to other cell lines or tissue 
cell types. Thus, polynucleotides and polypeptides have uses which include, but are 
not limited to, activating immune and hematopoietic cells and tissue cell types. 
Binding of a ligand to a receptor is known to alter intracellular levels of small 
molecules, such as calcium, potassium and sodium, as well as alter pH and membrane 
potential. Alterations in small molecule concentration can be measured to identify 
supernatants which bind to receptors of a particular cell. 

Moreover, when tested in TF-1 cell lines, the protein product of this gene has 
been shown to alter the steady-state messenger RNA levels of the following genes: 
c-fos, c-jun, egr-1, b561, bcl-2, CD40, cyclin D2, GAPDH, ICER, MAD3, p21, STAT3, 
ID3, and STAT-1. When tested in U937 cell lines, the protein product of this gene has 
been shown to alter the steady-state messenger RNA levels of the following genes: 
egr2, MKP1, ATF3, B562, cyclin D, cyclin D2, GATA3, MAD3, p21, TGF, DHFR, 
and JAK3. Based upon these results, it is anticipated that polynucleotides and 
polypeptides corresponding to this gene are useful as agonists or antagonists of the 
above referenced genes. Such activity is useful in therapeutic and/or diagnostic 
applications as referenced and more specifically discussed elsewhere herein. 

In specific embodiments, polypeptides of the invention comprise the 
sequence: MLLEVYGDSISVTVAIPL (SEQ ID NO:332), 
MHSPCQSKAADGLGKSETE (SEQ ID NO: 333), and/or MLKSLGLSTN (SEQ ID 
NO: 334). Polynucleotides encoding these polypeptides are also provided. 

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: AQRLAEECFYMLLEVYGDSISVTVAIPLMHSP 
CQSKAADGLGKSETEMLKSLGLSTNMSPFHLLGLKVFLTWALTLAQICLY 
FFEVQPLGLLALNFFCTATAGLKELCMHPPSLAFTPEFHTSLSPLAIPSFCGTS 
VSLSNSHTIPLSLYLPFPSKSRMPDTLHLLVHSLPLVHSQVLPVKDVTIEWPLC 
QRCLGSTCH Q (SEQ ID NO: 335). Polynucleotides encoding these polypeptides 
are also provided. 
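
The three shorter sequences recited above (SEQ ID NO: 332, 333, and 334) appear as consecutive stretches near the start of the longer sequence recited as SEQ ID NO: 335. As an illustration only, the following Python sketch locates each fragment within the opening portion of that sequence as transcribed here. The helper name `locate` is hypothetical, and the reported positions are 1-based offsets within the transcribed text, not the residue numbering of Table 1.

```python
from typing import Optional, Tuple

# Opening portion of the sequence recited above as SEQ ID NO: 335
# (the first two lines of the recited sequence, as transcribed in this text).
ORF_START = (
    "AQRLAEECFYMLLEVYGDSISVTVAIPLMHSP"
    "CQSKAADGLGKSETEMLKSLGLSTNMSPFHLLGLKVFLTWALTLAQICLY"
)

# Fragments recited in the specific embodiments.
FRAGMENTS = {
    "SEQ ID NO: 332": "MLLEVYGDSISVTVAIPL",
    "SEQ ID NO: 333": "MHSPCQSKAADGLGKSETE",
    "SEQ ID NO: 334": "MLKSLGLSTN",
}


def locate(fragment: str, sequence: str) -> Optional[Tuple[int, int]]:
    """Return the 1-based (start, end) of the first match, or None if absent."""
    index = sequence.find(fragment)
    if index < 0:
        return None
    return index + 1, index + len(fragment)


if __name__ == "__main__":
    for name, fragment in FRAGMENTS.items():
        print(name, locate(fragment, ORF_START))
```

In this transcription the three fragments abut one another, each beginning with a methionine residue.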

The polypeptide of this gene has been determined to have a transmembrane 
domain at about amino acid position 11 - 27 of the amino acid sequence referenced in 
Table 1 for this gene. Moreover, a cytoplasmic tail encompassing amino acids 28 to 
143 of this protein has also been determined. Based upon these characteristics, it is 
believed that the protein product of this gene shares structural features with type Ia 
membrane proteins. 

This gene is expressed primarily in neutrophils and T-cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, disorders and/or diseases affecting the immune system. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the immune 
system, expression of this gene at significantly higher or lower levels is routinely 
detected in certain tissues and cell types (e.g., immune, hematopoietic, cells and tissue 
of the immune system, and cancerous and wounded tissues) or bodily fluids (e.g., 
serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 191 as residues: Pro-97 to Asp-104. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in neutrophils and T-cells, combined with the detected 
calcium flux biological activity, indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of disorders 
affecting the immune system. Representative uses are described in the "Immune 
Activity" and "infectious disease" sections below, in Example 11, 13, 14, 16, 18, 19, 
20, and 27, and elsewhere herein. Briefly, the expression of this gene product 
indicates a role in regulating the proliferation; survival; differentiation; and/or 
activation of hematopoietic cell lineages, including blood stem cells. This gene 
product is involved in the regulation of cytokine production, antigen presentation, or 
other processes suggesting a usefulness in the treatment of cancer (e.g. by boosting 
immune responses). 

Since the gene is expressed in cells of lymphoid origin, the natural gene 
product is involved in immune functions. Therefore it is also useful as an agent for 
immunological disorders including arthritis, asthma, immunodeficiency diseases such 

as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory 
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities, 
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and 
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity 
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic 
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's 
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that 
influences the differentiation or behavior of other blood cells, or that recruits 
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in 
the expansion of stem cells and committed progenitors of various blood lineages, and 
in the differentiation and/or proliferation of various cell types. Furthermore, the 
protein may also be used to determine biological activity, raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. Protein, as well as, 
antibodies directed against the protein may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:78 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1356 of SEQ ID NO:78, b is an 
integer of 15 to 1370, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:78, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 69 

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: WIPRAAGIRHEVQVSLFQMFCFSSIFCSH 
EHTHLPGTFWLFLFLFLILPPSCPCFLPFSLAIETVRWPCWHHPTSFELCY 
PGTSIYYASRGGPXPNSEX (SEQ ID NO: 336). Polynucleotides encoding these 
polypeptides are also provided. 

This gene is expressed primarily in neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, diseases and/or disorders affecting the immune system, and neutrophils 
in particular. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the 
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the immune system, expression of this gene at significantly higher or 
lower levels is routinely detected in certain tissues and cell types (e.g., blood cells, 
and cells and tissue of the immune system, and cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution in neutrophils indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for the diagnosis and treatment of 
disorders affecting the immune system and neutrophils in particular. Representative 
uses are described in the "Immune Activity" and "infectious disease" sections below, 
in Example 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the 
expression of this gene product indicates a role in regulating the proliferation; 
survival; differentiation; and/or activation of hematopoietic cell lineages, including 
blood stem cells. This gene product is involved in the regulation of cytokine 
production, antigen presentation, or other processes suggesting a usefulness in the 
treatment of cancer (e.g. by boosting immune responses). 

Since the gene is expressed in cells of lymphoid origin, the natural gene 
product is involved in immune functions. Therefore it is also useful as an agent for 
immunological disorders including arthritis, asthma, immunodeficiency diseases such 
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory 
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities, 
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and 
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity 
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic 
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's 
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that 
influences the differentiation or behavior of other blood cells, or that recruits 
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in 
the expansion of stem cells and committed progenitors of various blood lineages, and 
in the differentiation and/or proliferation of various cell types. Furthermore, the 
protein may also be used to determine biological activity, raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. Protein, as well as, 
antibodies directed against the protein may show utility as a tumor marker and/or 
immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:79 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 354 of SEQ ID NO:79, b is an 
integer of 15 to 368, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:79, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 70 

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: XNXKSPLTIGNKSWSSTAVAAALELVDPPGCR 
NSARDSPELVHLGKGRPRKLMTYLFCSSISLLLLKVHSSGHQDIRKAKSKVP 

RLLIIQCPQQRE (SEQ ID NO: 337). Polynucleotides encoding these polypeptides 
are also provided. 

This gene is expressed primarily in smooth muscle. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, disorders affecting smooth muscle tissue, particularly vascular 
conditions. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the 
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of smooth muscle tissue, expression of this gene at significantly higher or 
lower levels is routinely detected in certain tissues or cell types (e.g., muscle, and 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid and spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression 
level in healthy tissue or bodily fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 193 as residues: Ser-18 to Val-31. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution primarily in smooth muscle indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for the 
diagnosis and treatment of disorders affecting smooth muscle tissue. Moreover, the 
protein is useful in the detection, treatment, and/or prevention of a variety of vascular 
disorders and conditions, which include, but are not limited to microvascular disease, 
vascular leak syndrome, aneurysm, stroke, embolism, thrombosis, coronary artery 
disease, arteriosclerosis, and/or atherosclerosis. Furthermore, the protein may also be 
used to determine biological activity, to raise antibodies, as tissue markers, to isolate 
cognate ligands or receptors, to identify agents that modulate their interactions, in 
addition to its use as a nutritional supplement. Protein, as well as, antibodies directed 
against the protein may show utility as a tumor marker and/or immunotherapy targets 
for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:80 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1074 of SEQ ID NO:80, b is an 
integer of 15 to 1088, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:80, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 71 

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: GPEENLSPSTPSQMPTIWVKLCLLQVCHGLFP 
LLKHWSQPMPLCVTLAPVSYWL (SEQ ID NO: 338). Polynucleotides encoding 
these polypeptides are also provided. 

This gene is expressed primarily in fetal heart, smooth muscle, and frontal 
cortex. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, muscular, vascular, or neural diseases and/or disorders, particularly 
defects or injury to cardiac muscle. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the cardiovascular system, expression of this gene at 
significantly higher or lower levels is routinely detected in certain tissues or cell types 
(e.g., muscular, vascular, neural, developmental, and cancerous and wounded tissues) 
or bodily fluids (e.g., serum, plasma, urine, amniotic fluid, synovial fluid and spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy 
tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution in fetal heart indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for diagnosing and treating defects 
to the heart either due to injury or congenital defects. Moreover, the protein is useful 
in the detection, treatment, and/or prevention of a variety of vascular disorders and 
conditions, which include, but are not limited to microvascular disease, vascular leak 
syndrome, aneurysm, stroke, embolism, thrombosis, coronary artery disease, 
arteriosclerosis, and/or atherosclerosis. Alternatively, polynucleotides and 
polypeptides corresponding to this gene are useful for the detection, treatment, and/or 
prevention of neurodegenerative disease states, behavioral disorders, or inflammatory 
conditions. Representative uses are described in the "Regeneration" and 
"Hyperproliferative Disorders" sections below, in Example 11, 15, and 18, and 
elsewhere herein. Briefly, the uses include, but are not limited to the detection, 
treatment, and/or prevention of Alzheimer's Disease, Parkinson's Disease, 
Huntington's Disease, Tourette Syndrome, meningitis, encephalitis, demyelinating 
diseases, peripheral neuropathies, neoplasia, trauma, congenital malformations, spinal 
cord injuries, ischemia and infarction, aneurysms, hemorrhages, schizophrenia, 
mania, dementia, paranoia, obsessive compulsive disorder, depression, panic disorder, 
learning disabilities, ALS, psychoses, autism, and altered behaviors, including 
disorders in feeding, sleep patterns, balance, and perception. In addition, elevated 
expression of this gene product in regions of the brain indicates it plays a role in 
normal neural function. Furthermore, the protein may also be used to determine 
biological activity, to raise antibodies, as tissue markers, to isolate cognate ligands or 
receptors, to identify agents that modulate their interactions, in addition to its use as a 
nutritional supplement. Protein, as well as, antibodies directed against the protein may 
show utility as a tumor marker and/or immunotherapy targets for the above listed 
tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:81 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1848 of SEQ ID NO:81, b is an 
integer of 15 to 1862, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:81, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 72 

The translation product of this gene shares sequence homology with adipose 
complement related protein which is thought to be important in regulating energy 
metabolism, insulin levels and fat stores. Moreover, the protein product of this gene 
has also been shown to have homology to the complement subcomponent C1Q A-chain 
precursor and HP-25 protein (See Genbank and Geneseq Accession Nos. 
emb|CAA41664.1, dbj|BAA02352.1, and W98013; all references and information 
available through this accession are hereby incorporated by reference herein). Based 
on the sequence similarity, the translation product of this gene is expected to share at 
least some biological activities with complement proteins. 

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: PRVRKEPEAMQWLRVRESPGEATGHRVTMG 
TAALGPVWAALLLFLLMCEIPMVELTFDRAVASDCQRCCDSEDPLDPAHVSS 
ASSSGRPHALPEIRPYINITILKGDKGDPGPMGLPGYMGREGPQGEPGPQGSK 
GDKGEMGSPGAPCQKRFFAFSVGRKTALHSGEDFQTLLFERVFVNLDGC 
FDMATGQFAAPLRGIYFFSLNVHSWNYKETYVHIMHNQKEAVILYAQPS 
ERSIMQSQSVMLDLAYGDRVWVRLFKRQRENAIYSNDFDTYITFSGHLIKA 
EDD (SEQ ID NO: 339). Polynucleotides encoding these polypeptides are also 
provided. 

This gene is expressed primarily in placenta and fetal kidney, and umbilical
vein and to a lesser extent in fetal heart, fetal liver/spleen, microvascular endothelial
cells and cancers of the lung and pharynx. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, vascular, renal, and reproductive diseases and/or disorders, particularly
cancers of the lung and pharynx. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the pulmonary and immune systems, expression of this
gene at significantly higher or lower levels is routinely detected in certain tissues or
cell types (e.g., vascular, renal, reproductive, immune, hematopoietic, pulmonary, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid and spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression
level in healthy tissue or bodily fluid from an individual not having the disorder.
Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 195 as residues: Asp-36 to Asp-48, Ser-57 to His-62,
Lys-77 to Gly-84, Met-92 to Gly-114, Gln-203 to Ile-209, Lys-231 to Tyr-239.
Polynucleotides encoding said polypeptides are also provided. 
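
Epitope lists of this form (three-letter residue code, 1-based position, "to", second residue and position) are regular enough to parse mechanically. The following is a minimal sketch, not taken from the specification, that turns such a list into numeric ranges; the names `EPITOPE_RE`, `parse_epitopes`, `epitope_lengths`, and `GENE_72_EPITOPES` are hypothetical, and the string reproduces the ranges recited for SEQ ID NO: 195 with OCR spacing normalized.

```python
import re

# Matches one range such as "Asp-36 to Asp-48" (three-letter residue codes).
EPITOPE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def parse_epitopes(text):
    """Return (first_residue, start, last_residue, end) tuples, 1-based inclusive."""
    return [
        (m.group(1), int(m.group(2)), m.group(3), int(m.group(4)))
        for m in EPITOPE_RE.finditer(text)
    ]


def epitope_lengths(text):
    """Length in residues of each recited epitope (both endpoints counted)."""
    return [end - start + 1 for _, start, _, end in parse_epitopes(text)]


# Ranges recited above for SEQ ID NO: 195.
GENE_72_EPITOPES = (
    "Asp-36 to Asp-48, Ser-57 to His-62, Lys-77 to Gly-84, "
    "Met-92 to Gly-114, Gln-203 to Ile-209, Lys-231 to Tyr-239"
)
```

Given the full sequence of SEQ ID NO: 195 (not reproduced in this passage), each parsed range could be cut out with an ordinary slice, `seq[start - 1:end]`, and the endpoint residues checked against the recited three-letter codes.
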
The tissue distribution in pharynx or lung, combined with the homology to
adipose complement related proteins, indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosing and treating cancers of the
pharynx or lung by modifying the metabolic balance in such tissues. Moreover, the
protein is useful in the detection, treatment, and/or prevention of a variety of vascular
disorders and conditions, which include, but are not limited to, microvascular disease,
vascular leak syndrome, aneurysm, stroke, embolism, thrombosis, coronary artery
disease, arteriosclerosis, and/or atherosclerosis. The gene product may also be
involved in lymphopoiesis; therefore, it can be used in immune disorders such as
infection, inflammation, allergy, immunodeficiency, etc. In addition, this gene product
may have commercial utility in the expansion of stem cells and committed
progenitors of various blood lineages, and in the differentiation and/or proliferation of
various cell types. Furthermore, the protein may also be used to determine biological
activity, to raise antibodies, as tissue markers, to isolate cognate ligands or receptors,
to identify agents that modulate their interactions, in addition to its use as a nutritional
supplement. Protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:82 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1604 of SEQ ID NO:82, b is an
integer of 15 to 1618, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:82, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 73

The translation product of this gene shares sequence homology with a 
hypothetical 54.7 kD protein (F37A4.1) from Caenorhabditis elegans (SwissProt 
locus YPTI_CAEEL, accession P41879). The protein product of this gene also has 
homology to the human NG26 which is thought to contain a human major
histocompatibility complex class III and is involved in T-cell maturation (See
Genbank Accession No. gb|AAD18079.1| (AF129756); all references and information
available through this accession are hereby incorporated by reference herein; for
example, J. Neurochem. 69 (6), 2516-2528 (1997)). Based on the sequence similarity,
the translation product of this gene is expected to share at least some biological
activities with nitric oxide synthase proteins. 

Preferred polypeptides of the invention comprise the following amino acid 
sequence: MLYPGSVYLLQKALMPVLLQGQARLVEECNGRRAKLLACDGNE 
IDTMFVDRRGTAEPQGQKLVICCEGNAGFYEVGCVSTPLEAGYSVLGWNHP

GFAGSTGVPFPQNEANAMDVVVQFAIHRLGFQPQDIIIYAWSIGGFTATWAA
MSYPDVSAMILDASFDDLVPLALKVMPDSWRGLVTRTVRQHLNLNNAEQLC 
RYQGPVLLIRRTKDEIITTTVPEDIMSNRGNDLLLKLLQHRYPRVMAEEGLRV 
VRQWLEASSQLEEASIYSRWEVEEDWCLSVLRSYQAEHGPDFPWSVGEDMS 
ADGRRQLALFLARKHLHNFEATHCTPLPAQNFQMPWHL (SEQ ID NO: 340). 

Polynucleotides encoding such polypeptides are also provided.

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: VCPKWCRFLTMLGHCCYFWQVWPASEALAA 

GPTPSTGSSSPSWKQHIGTSLQKTRGSLPTTTLTSGAGQSTSTGKNPAAGR
SLEGALPAGVWPCFAQSPCTGGQQTPSSTGLRSCLVRSPATWWRTP (SEQ ID
NO: 341). Polynucleotides encoding these polypeptides are also provided. 

Preferred polypeptides of the invention comprise the following amino acid 
sequence: WIPRAAGIRHEIYREXDSERAPASVPETPTAVTAPHSSSWDTYYQ 

PRALEKHADSILALASVFWSISYYSSPFAFFYLYRKGYLSLSKVVPFSHYAG
TLLLLLAGVACXRGIGRWTNPQYRQFITILEATHRNQSSENKRQLANYNFD 
FRSWPVDFHWEEPSSRKESRGGPSRRGVALLRPEPLHRGTADTLLNRVKKL 
PCQITSYLVAHTLGRRMLYPGSVYLLQKALMPVLLQGQARLVEECNGRRAK
LLACDGNEIDTMFVDRRGTAEPQGQKLVICCEGNAGFYEVGCVSTPLEAGYS 
VLGWNHPGFAGSTGVPFPQNEANAMDVVVQFAIHRLGFQPQDIIIYAWSI 
GGFTATWAAMSYPDVSAMILDASFDDLVPLALKVMPDSWRGLVTRTVRQ 
HLNLNNAEQLCRYQGPVLLIRRTKDEIITTTVPEDIMSNRGNDLLLKLLQHRY
PRVMAEEGLRVVRQWLEASSQLEEASIYSRWEVEEDWCLSVLRSYQAEHGP 
DFPWSVGEDMSADGRRQLALFLARKHLHNFEATHCTPLPAQNFQMPWHL
(SEQ ID NO: 342). Polynucleotides encoding these polypeptides are also provided. A 
preferred polypeptide variant of the invention comprises the following amino acid 

sequence: HERAXGPSRGHGELLSCVLGPRLYKIYRERDSERAPASVPETPTA
VTAPHSSSWDTYYQPRALEKHADSILALASVFWSISYYSSPFAFFYLYRKGY
LSLSKVVPFSHYAGTLLLLLAGVACSEALAAGPTPSTGSSSPSWKQHIGTSLQ
KTRGSLPTTTLTSGAGQSTSTGKNPAAGRSLEGALPAGVWPCFAQSPCTGG
QQTPSSTGLRSCLVRSPATWWRTP (SEQ ID NO: 343). Polynucleotides
encoding these polypeptides are also provided.

The gene encoding the disclosed cDNA is believed to reside on chromosome 
6. Accordingly, polynucleotides related to this invention are useful as a marker in 
linkage analysis for chromosome 6. 

This gene is expressed primarily in cerebellum, pituitary, fetal liver, and
primary dendritic cells and to a lesser extent in a wide range of tissues and
developmental stages (i.e., fetal and adult tissue, etc.).

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, neural, developmental, and immune diseases and/or disorders,
particularly those involving self recognition and T- and B-cell maturation, and cancer.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
neural or hormonal system, expression of this gene at significantly higher or lower
levels is routinely detected in certain tissues or cell types (e.g., neural, developmental,
immune, hepatic, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, amniotic fluid, urine, synovial fluid and spinal fluid) or another tissue or cell
sample taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 196 as residues: Thr-23 to Lys-34, Leu-41 to Ser-47,
Ala-57 to Ala-68, Pro-89 to Gly-101, Pro-110 to Pro-117. Polynucleotides encoding
said polypeptides are also provided. 

The tissue distribution in developmental and immune cells, combined with the 

homology to the human major histocompatibility complex class III region, indicates
that polynucleotides and polypeptides corresponding to this gene are useful for
treatment and diagnosis of cancer and other proliferative disorders. Representative
uses are described in the "Immune Activity" and "infectious disease" sections below,
in Examples 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the
expression of this gene product indicates a role in regulating the proliferation,
survival, differentiation, and/or activation of hematopoietic cell lineages, including
blood stem cells. This gene product is involved in the regulation of cytokine 
production, antigen presentation, or other processes suggesting a usefulness in the 
treatment of cancer (e.g. by boosting immune responses). 

Since the gene is expressed in cells of lymphoid origin, the natural gene
product is involved in immune functions. Therefore it is also useful as an agent for
immunological disorders including arthritis, asthma, immunodeficiency diseases such
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that
influences the differentiation or behavior of other blood cells, or that recruits
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in
the expansion of stem cells and committed progenitors of various blood lineages, and
in the differentiation and/or proliferation of various cell types. Potentially, this gene
product is involved in synapse formation, neurotransmission, learning, cognition,
homeostasis, or neuronal differentiation or survival. Furthermore, the protein may
also be used to determine biological activity, to raise antibodies, as tissue markers, to
isolate cognate ligands or receptors, to identify agents that modulate their interactions,
in addition to its use as a nutritional supplement. Protein, as well as antibodies
directed against the protein, may show utility as a tumor marker and/or
immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 

available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:83 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 2020 of SEQ ID NO:83, b is an
integer of 15 to 2034, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:83, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 74

The translation product of this gene shares sequence homology with the br-1 
protein from the snail nervous system (EMBL HPBRIGENE) which codes for nitric
oxide synthetase and which is thought to be important in mediating a variety of
cellular responses, including vasodilation. Preferred polypeptides of the invention
comprise the following amino acid sequence: MFKRHQRLKKDSTQAEEDLSEQ
EQNQLNVLKKHGYVVGRVGRTFLYSEEQKDNIPFEFDADSLAFDMENDPVM 
GTHKSTKQVELTAQDVKDAHWFYDTPGITKENCILNLLTEKEVNIVLPTQSIV 
PRTFVLKPGMVLFLGAIGRIDFLQGNQSAWFTVVASNILPVHITSLDRADALY 
QKHAGHTLLQIPMGGKER\1AGFPPLVAEDIMLKEGLGASEAVADIKFSSAG 

WVSVTPNFKDRLHLRGYTPEGTVLTVRPPLLPYIVNIKGQRIKKSVAYKTKKP
PSLMYNVRKKKGKINV (SEQ ID NO: 344). Polynucleotides encoding such 
polypeptides are also provided. 

A preferred polypeptide fragment of the invention comprises the following
amino acid sequence: MLPARLPFRLLSLFLRGSAPTAARHGLREPLLERRCAA 
ASSFQHSSSLGRELPYDPVDTEGFGEGGDMQERFLFPEYrLDPEPQPTREKQL 
QELQQQQEEEERQRQQRREERRQQNLRARSREHPVVGHPDPALPPSGVNCS 
GCGAXLHCQDAGVPGYLPREKFLRTAEADGGLARTVCQRCWLLSHHRRALR
LQVSREQYLELVSAALRXPGPSLVLYMVDLLDLPDALLPDLPALVGPKQLIV 
LGNKVDLLPQDAPGYRQRLRERLWEDCARAGLLLAPGTKGHSAPSRTSHR 
TGRIRIRRTGPAQWSGTCG (SEQ ID NO: 345). Polynucleotides encoding these 
polypeptides are also provided. 

When tested against U937 cell lines, supernatants removed from cells
containing this gene activated the GAS (gamma activating sequence) promoter
element. Thus, it is likely that this gene activates myeloid cells through the JAK-STAT
signal transduction pathway. GAS is a promoter element found upstream of
many genes which are involved in the Jak-STAT pathway. The Jak-STAT pathway is
a large signal transduction pathway involved in the differentiation and proliferation
of cells. Therefore, activation of the Jak-STAT pathway, reflected by the binding of
the GAS element, can be used to indicate proteins involved in the proliferation and
differentiation of cells.

This gene is expressed primarily in early stage human brain, smooth muscle,
and endometrial tumor and, to a lesser extent, in a variety of tissues representing many
organs and developmental states. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cardiovascular, vascular, and neural diseases and/or disorders,
particularly congestive heart disease and neurological disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the circulatory and
neural systems, expression of this gene at significantly higher or lower levels is
routinely detected in certain tissues or cell types (e.g., cardiovascular, vascular,
neural, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, amniotic fluid, synovial fluid and spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 197 as residues: Phe-42 to Leu-48, Pro-53 to Asp-58,
Pro-81 to Glu-123, Asp-256 to Trp-269, Gly-282 to Ser-306, Arg-333 to Gly-339,
Arg-403 to Gln-425, Ser-446 to Asn-452, His-475 to Gln-480, Gly-592 to Met-597,
Pro-635 to His-642, Lys-667 to Lys-672, Lys-678 to Ser-684. Polynucleotides
encoding said polypeptides are also provided.

The tissue distribution in smooth muscle and vascular tissues, combined with 
the homology to nitric oxide synthetase, indicates that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis and treatment of
congestive heart failure and neurological degenerative disorders. Polynucleotides and
polypeptides corresponding to this gene are useful for the detection, treatment, and/or
prevention of neurodegenerative disease states, behavioral disorders, or inflammatory
conditions. Representative uses are described in the "Regeneration" and
"Hyperproliferative Disorders" sections below, in Examples 11, 15, and 18, and
elsewhere herein. Briefly, the uses include, but are not limited to, the detection,
treatment, and/or prevention of Alzheimer's Disease, Parkinson's Disease,
Huntington's Disease, Tourette Syndrome, meningitis, encephalitis, demyelinating
diseases, peripheral neuropathies, neoplasia, trauma, congenital malformations, spinal
cord injuries, ischemia and infarction, aneurysms, hemorrhages, schizophrenia,
mania, dementia, paranoia, obsessive compulsive disorder, depression, panic disorder,
learning disabilities, ALS, psychoses, autism, and altered behaviors, including
disorders in feeding, sleep patterns, balance, and perception. In addition, elevated 
expression of this gene product in regions of the brain indicates it plays a role in 
normal neural function. 

Potentially, this gene product is involved in synapse formation,
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or
survival. Moreover, the protein is useful in the detection, treatment, and/or prevention
of a variety of vascular disorders and conditions, which include, but are not limited to,
microvascular disease, vascular leak syndrome, aneurysm, stroke, embolism,
thrombosis, coronary artery disease, arteriosclerosis, and/or atherosclerosis.
Furthermore, the protein may also be used to determine biological activity, to raise
antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents
that modulate their interactions, in addition to its use as a nutritional supplement.
Protein, as well as antibodies directed against the protein, may show utility as a tumor
marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 

related to SEQ ID NO:84 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 2226 of SEQ ID NO:84, b is an
integer of 15 to 2240, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:84, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 75

The translation product of this gene shares sequence homology with the
human KE04p, in addition to an unidentified C. elegans gene.

The polypeptide of this gene has been determined to have a transmembrane
domain at about amino acid position 9 - 25 of the amino acid sequence referenced in
Table 1 for this gene. Moreover, a cytoplasmic tail encompassing amino acids 1 to 8
of this protein has also been determined. Based upon these characteristics, it is
believed that the protein product of this gene shares structural features with type II
membrane proteins.
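
The topology described above (cytoplasmic tail at residues 1 to 8, transmembrane domain at about residues 9 to 25, remainder on the far side of the membrane) corresponds to three slices of the sequence. The sketch below illustrates only the coordinate arithmetic. It assumes 1-based, inclusive positions, and the function name `split_type_ii` and the placeholder sequence in the usage note are hypothetical and are not the Table 1 sequence.

```python
def split_type_ii(seq: str, tm_start: int = 9, tm_end: int = 25):
    """Split a sequence into (cytoplasmic tail, TM domain, remainder).

    Positions are 1-based and inclusive, as in the text, so residue p
    sits at Python index p - 1.
    """
    if not 1 <= tm_start <= tm_end <= len(seq):
        raise ValueError("transmembrane bounds fall outside the sequence")
    tail = seq[: tm_start - 1]            # residues 1 .. tm_start - 1
    tm_domain = seq[tm_start - 1:tm_end]  # residues tm_start .. tm_end
    remainder = seq[tm_end:]              # residues tm_end + 1 .. end
    return tail, tm_domain, remainder
```

For a made-up 30-residue placeholder such as `"A" * 8 + "L" * 17 + "K" * 5`, the call returns an 8-residue tail, a 17-residue transmembrane segment, and a 5-residue remainder.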

In another embodiment, polypeptides comprising the amino acid sequence of
the open reading frame upstream of the predicted signal peptide are contemplated by
the present invention. Specifically, polypeptides of the invention comprise the
following amino acid sequence: PSFRRERVETGGGGPVTHGTEGPFLPLPGGTRM
NMTQARVLVAAVVGLVAVLLYASIHKIEEGHLAVYYRGGALLTSPSGPGYH
IMLPFITTFRSVQTTLQTDEVKNVPCGTSCGVMIYIDRIEVVNMLAPYAVFDIV
RNYTADYDKTLIFNKIHHELNQFCSAHTLQEVYIELFDQIDENLKQALQKDL 
NLMAPGLTIQAVRVTKPKIPEAIRRNFELMEAEKTKLLIAAQKQKVVEKEA 
ETERKKAVIEAEKIAQVAKIRFQQKVMEKETEKRISEIEDAAFLAREKAKA 
DAEYYAAHKYATSNKHKLTPEYLELKKYQAIASNSKIYFGSNIPNMFVDSSC
ALKYSDIRTGRESSLPSKEALEPSGENVIQNKESTG (SEQ ID NO: 346).
Polynucleotides encoding these polypeptides are also provided. 

The gene encoding the disclosed cDNA is believed to reside on chromosome
10. Accordingly, polynucleotides related to this invention are useful as a marker in
linkage analysis for chromosome 10.

This gene is expressed primarily in fetal tissue, including 8 week whole
embryo, fetal liver spleen, nine week old early stage human, fetal heart, fetal liver,
fetal lung, and placenta and to a lesser extent in a variety of cancers, and other normal
tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer and diseases of fetal development. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the fetal tissues, especially the
liver, expression of this gene at significantly higher or lower levels is routinely
detected in certain tissues or cell types (e.g., developmental, hepatic, immune,
hematopoietic, pulmonary, cardiovascular, and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 

epitopes shown in SEQ ID NO: 198 as residues: Leu-68 to Lys-74, Tyr-109 to
Lys-115, Gln-200 to Val-205, Lys-207 to Lys-214, Glu-237 to Ile-244, Ala-271 to
Thr-279, Ser-317 to Ser-329, Gln-342 to Gly-348. Polynucleotides encoding said
polypeptides are also provided. 

The tissue distribution of this gene (primarily fetal tissue and cancerous tissue, 
both of which are undergoing rapid growth) indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for treatment and diagnosis of
cancer and disorders of fetal development. Moreover, the expression within fetal
tissue and other cellular sources marked by proliferating cells indicates this protein
may play a role in the regulation of cellular division, and may show utility in the
diagnosis, treatment, and/or prevention of developmental diseases and disorders,
including cancer, and other proliferative conditions. Representative uses are described
in the "Hyperproliferative Disorders" and "Regeneration" sections below and
elsewhere herein. Briefly, developmental tissues rely on decisions involving cell
differentiation and/or apoptosis in pattern formation. 

Dysregulation of apoptosis can result in inappropriate suppression of cell
death, as occurs in the development of some cancers, or in failure to control the extent
of cell death, as is believed to occur in acquired immunodeficiency and certain
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of
potential roles in proliferation and differentiation, this gene product may have
applications in the adult for tissue regeneration and the treatment of cancers. It may
also act as a morphogen to control cell and tissue type specification. Therefore, the
polynucleotides and polypeptides of the present invention are useful in treating,
detecting, and/or preventing said disorders and conditions, in addition to other types
of degenerative conditions. Thus this protein may modulate apoptosis or tissue
differentiation and is useful in the detection, treatment, and/or prevention of
degenerative or proliferative conditions and diseases. The protein is useful in
modulating the immune response to aberrant polypeptides, as may exist in
proliferating and cancerous cells and tissues. The protein can also be used to gain new
insight into the regulation of cellular growth and proliferation. Furthermore, the
protein may also be used to determine biological activity, to raise antibodies, as tissue
markers, to isolate cognate ligands or receptors, to identify agents that modulate their
interactions, in addition to its use as a nutritional supplement. Protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy targets for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:85 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1474 of SEQ ID NO:85, b is an
integer of 15 to 1488, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:85, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 76 
When tested against U937 and Jurkat cell lines, supernatants removed from
cells containing this gene activated the GAS (gamma activating sequence) promoter
element. Thus, it is likely that this gene activates myeloid and T-cells, and to a lesser
extent other immune cells and tissue cell types, through the JAK-STAT signal
transduction pathway. GAS is a promoter element found upstream of many genes
which are involved in the Jak-STAT pathway. The Jak-STAT pathway is a large
signal transduction pathway involved in the differentiation and proliferation of cells.
Therefore, activation of the Jak-STAT pathway, reflected by the binding of the GAS 
element, can be used to indicate proteins involved in the proliferation and 
differentiation of cells. 

In another embodiment, polypeptides comprising the amino acid sequence of

the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: WSTGNASWEKKDNFILSADFEMMGLGNGRR 
SMKSPPLVLAALVACIIVLGFNYWIASSRSVDLQTRIMELEGRVRRRAAERG 

AVELKKNEFQGELEKQREQLDKIQSSHNFQLESVNKLYQDEKAVLVNNITTG
ERLIRVLQDQLKTLQRNYGRLQQDVLQFQKNQTNLERKFSYDLSQCINQMKE 
VKEQCEERIEEVTKKGNEAVASRDLSENNDQRQQLQALSEPQPRLQAAGL 
PHTEVPQGKGNVLGNSKSQTPAPSSEVVLDSKRQVEKEETNEIQVVNEE
PQRDRLPQEPGREQVVEDRPVGGRGFGGAGELGQTPQVQAALXVSQENPE 
MEGPERDQLVIPDGQEEEQEAAGEGRNQQKLRGEDDYNMDENEAESETDKQ 
AALAGNDRNIDVFNVEDQKRDTINLLDQREKRNHTL (SEQ ID NO: 347).
Polynucleotides encoding these polypeptides are also provided.

The gene encoding the disclosed cDNA is believed to reside on chromosome 
9. Accordingly, polynucleotides related to this invention are useful as a marker in 
linkage analysis for chromosome 9. 

This gene is expressed primarily in human endometrial tumor and other
tumors and to a lesser extent in a variety of other healthy adult and fetal tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, developmental diseases and/or disorders, particularly cancer and other
proliferative disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the endometrial tissue, cervix and uterus, expression of
this gene at significantly higher or lower levels is routinely detected in certain tissues
or cell types (e.g., developmental, reproductive, and cancerous and wounded tissues)
or bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 199 as residues: Asn-6 to Lys-12, Leu-65 to Phe-70,
Glu-73 to His-88, Gln-123 to Gln-135, Gln-142 to Leu-156, Arg-173 to Gly-181,
Asp-189 to Gln-199, Ser-204 to Arg-209, Glu-219 to Gly-225, Gly-229 to Pro-238,
Ser-246 to Asn-256, Glu-263 to Arg-276. Polynucleotides encoding said polypeptides
are also provided.

The tissue distribution in endometrial tissue indicates that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis and treatment of
endometrial, cervical and uterine cancer. Because of potential roles in proliferation
and differentiation, this gene product may have applications in the adult for tissue
regeneration and the treatment of cancers. It may also act as a morphogen to control
cell and tissue type specification. Therefore, the polynucleotides and polypeptides of
the present invention are useful in treating, detecting, and/or preventing said disorders
and conditions, in addition to other types of degenerative conditions. Thus this protein
may modulate apoptosis or tissue differentiation and is useful in the detection,
treatment, and/or prevention of degenerative or proliferative conditions and diseases.
The protein is useful in modulating the immune response to aberrant polypeptides, as
may exist in proliferating and cancerous cells and tissues. The protein can also be
used to gain new insight into the regulation of cellular growth and proliferation.
Furthermore, the protein may also be used to determine biological activity, to raise
antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents
that modulate their interactions, in addition to its use as a nutritional supplement.

Protein, as well as antibodies directed against the protein, may show utility as a
tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:86 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 3160 of SEQ ID NO:86, b is an
integer of 15 to 3174, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:86, and where b is greater than or equal to a + 14.
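
The constraints of the a-b formula can be illustrated with a short Python sketch. This is only an illustration: the function names are hypothetical, and the sequence length of 3174 for SEQ ID NO:86 is assumed from the upper bound given for b, not taken from the sequence listing itself.

```python
def is_excluded_fragment(a: int, b: int, seq_len: int) -> bool:
    """Return True if positions a..b (1-based, inclusive) satisfy the a-b formula.

    Constraints, as stated in the text:
      1 <= a <= seq_len - 14,  15 <= b <= seq_len,  b >= a + 14.
    """
    return 1 <= a <= seq_len - 14 and 15 <= b <= seq_len and b >= a + 14


def count_excluded_fragments(seq_len: int) -> int:
    """Count all (a, b) pairs allowed by the formula for a given length."""
    n = seq_len - 14          # number of valid start positions a
    return n * (n + 1) // 2   # start a leaves seq_len - (a + 14) + 1 choices for b


SEQ86_LEN = 3174  # assumed length of SEQ ID NO:86 (upper bound on b in the text)

if __name__ == "__main__":
    print(is_excluded_fragment(1, 15, SEQ86_LEN))       # shortest fragment at the 5' end
    print(is_excluded_fragment(3160, 3174, SEQ86_LEN))  # shortest fragment at the 3' end
    print(count_excluded_fragments(SEQ86_LEN))          # number of distinct a-b fragments
```

Because b must be at least a + 14, every fragment described by the formula is at least 15 nucleotides long.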

FEATURES OF PROTEIN ENCODED BY GENE NO: 77 

The translation product of this gene shares sequence homology with protein
disulfide isomerase from Acanthamoeba castellanii (See Genbank Locus
ACADISPROA accession L28174, genpep locus 456013) which is thought to be
important in converting proteins into their native conformations. The protein product
of this gene was also shown to have homology to a phospholipase C homologue
derived from a mast cell cDNA library (See Geneseq Accession No. R99411). All
references and information available through these accessions are hereby incorporated
by reference herein, for example, Gene 150(1): 175-179 (1994).

Included in this invention as preferred domains are the endoplasmic reticulum
targeting sequence domain and the thioredoxin family active site domain, which were
identified using the ProSite analysis tool (Swiss Institute of Bioinformatics). Proteins
that permanently reside in the lumen of the endoplasmic reticulum (ER) seem to be
distinguished from newly synthesized secretory proteins by the presence of the
C-terminal sequence Lys-Asp-Glu-Leu (KDEL) [1,2]. While KDEL is the preferred
signal in many species, variants of that signal are used by different species. This
situation is described in the following table.



Signal    Species

KDEL      Vertebrates, Drosophila, Caenorhabditis elegans, plants
HDEL      Saccharomyces cerevisiae, Kluyveromyces lactis, plants
DDEL      Kluyveromyces lactis
ADEL      Schizosaccharomyces pombe (fission yeast)
SDEL      Plasmodium falciparum

The signal is usually very strictly conserved in major ER proteins, but some
minor ER proteins have divergent sequences (probably because efficient retention of
these proteins is not crucial to the cell). Proteins bearing the KDEL-type signal are not
simply held in the ER, but are selectively retrieved from a post-ER compartment by a
receptor and returned to their normal location. The consensus pattern is as follows:
[KRHQSA]-[DENQ]-E-L>.

Thioredoxins are small proteins of approximately one
hundred amino acid residues which participate in various redox reactions via the
reversible oxidation of an active center disulfide bond. They exist in either a reduced
form or an oxidized form where the two cysteine residues are linked in an
intramolecular disulfide bond. Thioredoxin is present in prokaryotes and eukaryotes
and the sequence around the redox-active disulfide bond is well conserved.
Bacteriophage T4 also encodes a thioredoxin but its primary structure is not
homologous to bacterial, plant and vertebrate thioredoxins. A number of eukaryotic
proteins contain domains evolutionarily related to thioredoxin; all of them seem to be
protein disulphide isomerases (PDI). PDI (EC 5.3.4.1) is an endoplasmic reticulum
enzyme that catalyzes the rearrangement of disulfide bonds in various proteins. The
various forms of PDI which are currently known are:
- PDI major isozyme: a multifunctional protein that also functions as the beta subunit
of prolyl 4-hydroxylase (EC 1.14.11.2), as a component of oligosaccharyl transferase
(EC 2.4.1.119), as thyroxine deiodinase (EC 3.8.1.4), as glutathione-insulin
transhydrogenase (EC 1.8.4.2) and as a thyroid hormone-binding protein.
- ERp60 (ER-60; 58 Kd microsomal protein). ERp60 was originally thought to be a
phosphoinositide-specific phospholipase C isozyme and later to be a protease.
- ERp72.
- P5.
All PDIs contain two or three (ERp72) copies of the thioredoxin domain. The consensus
pattern is as follows: [LIVMF]-[LIVMSTA]-x-[LIVMFYC]-[FYWSTHE]-x(2)-[FYWGTN]-C-
[GATPLVE]-[PHYWSTA]-C-x(6)-[LIVMFYWT]. The two C's form the redox-active bond.
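
The two consensus patterns quoted above are written in PROSITE syntax. As a minimal sketch, assuming only the subset of that syntax used here, they can be translated into Python regular expressions and tried against peptide sequences disclosed for this gene. The helper name `prosite_to_regex` is hypothetical and is not part of ProSite or of any library.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Handles only the syntax used in the two patterns quoted in the text:
    '-' separators, 'x' wildcards, [..] residue classes, (n) repeat counts,
    and a trailing '>' meaning "anchored at the C-terminus".
    """
    anchored = pattern.endswith(">")
    body = pattern.rstrip(">")
    parts = []
    for element in body.split("-"):
        element = element.replace("(", "{").replace(")", "}")  # x(6) -> x{6}
        if element.startswith("x"):
            element = "." + element[1:]                         # x -> any residue
        parts.append(element)
    return "".join(parts) + ("$" if anchored else "")


ER_TARGETING = re.compile(prosite_to_regex("[KRHQSA]-[DENQ]-E-L>"))
THIOREDOXIN = re.compile(prosite_to_regex(
    "[LIVMF]-[LIVMSTA]-x-[LIVMFYC]-[FYWSTHE]-x(2)-[FYWGTN]-C-"
    "[GATPLVE]-[PHYWSTA]-C-x(6)-[LIVMFYWT]"
))

if __name__ == "__main__":
    # Peptides taken from the sequences disclosed for this gene.
    print(bool(ER_TARGETING.search("SLHRFVLSQAKDEL")))      # C-terminal KDEL
    print(bool(THIOREDOXIN.search("FIKFFAPWCGHCKALAPTW")))  # thioredoxin active site
```

The `$` anchor reflects the requirement that the retention signal sit at the very C-terminus, so an internal KDEL would not match.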

Preferred polypeptides of the invention comprise the following amino acid
sequence: SLHRFVLSQAKDEL (SEQ ID NO: 348), FIKFFAPWCGHCKALAPTW
(SEQ ID NO: 349), and/or FIKFYAPWCGHCKTLAPTW (SEQ ID NO: 350).
Polynucleotides encoding these polypeptides are also provided.

Further preferred are polypeptides comprising the endoplasmic reticulum
targeting sequence domain and thioredoxin family active site domain of the sequence
referenced in Table 1 for this gene, and at least 5, 10, 15, 20, 25, 30, 50, or 75
additional contiguous amino acid residues of this referenced sequence. The additional
contiguous amino acid residues are N-terminal or C-terminal to the endoplasmic
reticulum targeting sequence domain and thioredoxin family active site domain.
Alternatively, the additional contiguous amino acid residues are both N-terminal and
C-terminal to the endoplasmic reticulum targeting sequence domain and thioredoxin
family active site domain, wherein the total N- and C-terminal contiguous amino acid
residues equal the specified number. Based on the sequence similarity, the translation
product of this gene is expected to share at least some biological activities with
thioredoxin proteins. Such activities are known in the art, some of which are
described elsewhere herein.

In another embodiment, polypeptides comprising the amino acid sequence of
the open reading frame upstream of the predicted signal peptide are contemplated by
the present invention. Specifically, polypeptides of the invention comprise the
following amino acid sequence: RRGRGVPGPRGRRRLWSAACGHCQRLQPTWN
DLGDKYNSMEXAKVYVAKVDCTAHSDVCSAQGVRGYPTLKLFKPGQEAV
KYQGPRDFQTLENWMLQTLNEEPVTPEPEVEPPSAPELKQGLYELSASNFELH
VAQGDHFIKFFAPWCGHCKALAPTWEQLALGLEHSETVKIGKVDCTQHY
ELCSGNQVRGYPTLLWFRDGKKVDQYKGKRDLESLREYVESQLQRTETGA
TETVTPSEAPVLAAEPEADKGTVLALTENNFDDTIAEGITFIKFYAPWCGHC
KTLAPTWEELSKKEFPGLAGVKIAEVDCTAERNICSKYSVRGYPTLLLFRGGK
KVSEHSGGRDLDSLHRFVLSQAKDEL (SEQ ID NO: 351). Polynucleotides
encoding these polypeptides are also provided.

This gene is expressed primarily in human chondrosarcoma and endothelial
cells and to a lesser extent in a wide range of normal and diseased adult and fetal
tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, chondrosarcoma and other cancers and proliferative disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels is
routinely detected in certain tissues or cell types (e.g., vascular, skeletal,
developmental, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, amniotic fluid, synovial fluid and spinal fluid) or another tissue or cell
sample taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution in chondrosarcoma, combined with the homology to
protein disulfide isomerase and phospholipase C, indicates that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis and treatment of
chondrosarcoma and other cancers and proliferative disorders, and possibly as a
reagent for in vitro production of proteins. Representative uses are described in the
"Hyperproliferative Disorders" and "Regeneration" sections below and elsewhere
herein. Briefly, developmental tissues rely on decisions involving cell differentiation
and/or apoptosis in pattern formation.

Dysregulation of apoptosis can result in inappropriate suppression of cell
death, as occurs in the development of some cancers, or in failure to control the extent
of cell death, as is believed to occur in acquired immunodeficiency and certain
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of
potential roles in proliferation and differentiation, this gene product may have
applications in the adult for tissue regeneration and the treatment of cancers. It may
also act as a morphogen to control cell and tissue type specification. Therefore, the
polynucleotides and polypeptides of the present invention are useful in treating,
detecting, and/or preventing said disorders and conditions, in addition to other types
of degenerative conditions. Thus this protein may modulate apoptosis or tissue
differentiation and is useful in the detection, treatment, and/or prevention of
degenerative or proliferative conditions and diseases. The protein is useful in
modulating the immune response to aberrant polypeptides, as may exist in
proliferating and cancerous cells and tissues. The protein can also be used to gain new
insight into the regulation of cellular growth and proliferation. Moreover, the
expression in endothelial cells indicates the protein is useful in the detection,
treatment, and/or prevention of a variety of vascular disorders and conditions, which
include, but are not limited to, microvascular disease, vascular leak syndrome,
aneurysm, stroke, embolism, thrombosis, coronary artery disease, arteriosclerosis,
and/or atherosclerosis. Furthermore, the protein may also be used to determine
biological activity, to raise antibodies, as tissue markers, to isolate cognate ligands or
receptors, to identify agents that modulate their interactions, in addition to its use as a
nutritional supplement. Protein, as well as antibodies directed against the protein, may
show utility as a tumor marker and/or immunotherapy targets for the above listed
tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:87 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 2766 of SEQ ID NO:87, b is an
integer of 15 to 2780, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:87, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 78

This gene is expressed primarily in thyroid and thymus.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, thyroid diseases including thyroid cancer and diseases of function
including Graves' disease, hyper- and hypothyroidism, as well as diseases of the
thymus. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the endocrine and immune systems, expression of this gene at
significantly higher or lower levels is routinely detected in certain tissues or cell types
(e.g., endocrine, immune, and cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution in thyroid cells and tissues indicates that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment of diseases of the thyroid and thymus. Representative uses are
described in the "Biological Activity", "Hyperproliferative Disorders", and "Binding
Activity" sections below, in Examples 11, 17, 18, 19, 20 and 27, and elsewhere herein.
Briefly, the protein can be used for the detection, treatment, and/or prevention of
Addison's Disease, Cushing's Syndrome, and disorders and/or cancers of the
pancreas (e.g. diabetes mellitus), adrenal cortex, ovaries, pituitary (e.g. hyper-,
hypopituitarism), thyroid (e.g. hyper-, hypothyroidism), parathyroid (e.g. hyper-,
hypoparathyroidism), hypothalamus, and testes. Furthermore, the protein may also
be used to determine biological activity, to raise antibodies, as tissue markers, to
isolate cognate ligands or receptors, to identify agents that modulate their interactions,
in addition to its use as a nutritional supplement. Protein, as well as antibodies
directed against the protein, may show utility as a tumor marker and/or
immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:88 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1047 of SEQ ID NO:88, b is an
integer of 15 to 1061, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:88, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 79 

The translation product of this gene shares sequence homology with collagen,
which is thought to be important as a structural material in a variety of human tissues
and products including hair, nails, muscle and bone.

A preferred polypeptide fragment of the invention comprises the following
amino acid sequence: MRPQGPAASPQRLRGLLLLLLLQLPAPSSASEIPKGKQK
AHSGRGRWWTCIMECAYKGQQECLVETGALGPMAFRVHLGSQVGMDSKEK
RGNV (SEQ ID NO: 352). Polynucleotides encoding these polypeptides are also
provided.

This gene is expressed primarily in smooth muscle and to a lesser extent in 12
week old early stage human, epididymis, healing groin wound, synovial hypoxia,
stromal cells, ulcerative colitis, breast and 8 week old embryo, as well as a variety of
other normal and diseased cell types from adult and fetal tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer and other proliferative disorders as well as diseases of smooth
muscle. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the muscular system, expression of this gene at significantly higher or
lower levels is routinely detected in certain tissues or cell types (e.g., vascular,
developmental, reproductive, and cancerous and wounded tissues) or bodily fluids
(e.g., serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell
sample taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 202 as residues: Glu-32 to Glu-46, Pro-63 to Ala-71,
Pro-81 to Lys-90, Ser-97 to Trp-111, Lys-130 to Ser-135, Leu-147 to Cys-154,
Asp-179 to Asn-186, Ser-219 to Gly-229. Polynucleotides encoding said polypeptides are
also provided.

The tissue distribution in smooth muscle and homology to collagen indicates
that polynucleotides and polypeptides corresponding to this gene are useful for
treatment and diagnosis of vascular diseases and/or disorders.
Representative uses are described in the "Biological Activity", "Hyperproliferative
Disorders", "infectious disease", and "Regeneration" sections below, in Examples 11,
19, and 20, and elsewhere herein. Briefly, the protein is useful in detecting, treating,
and/or preventing congenital disorders (i.e. nevi, moles, freckles, Mongolian spots,
hemangiomas, port-wine syndrome), integumentary tumors (i.e. keratoses, Bowen's
Disease, basal cell carcinoma, squamous cell carcinoma, malignant melanoma,
Paget's Disease, mycosis fungoides, and Kaposi's sarcoma), injuries and inflammation
of the skin (i.e. wounds, rashes, prickly heat disorder, psoriasis, dermatitis),
atherosclerosis, urticaria, eczema, photosensitivity, autoimmune disorders (i.e. lupus
erythematosus, vitiligo, dermatomyositis, morphea, scleroderma, pemphigoid, and
pemphigus), keloids, striae, erythema, petechiae, purpura, and xanthelasma. In
addition, such disorders may predispose increased susceptibility to viral and bacterial
infections of the skin (i.e. cold sores, warts, chickenpox, molluscum contagiosum,
herpes zoster, boils, cellulitis, erysipelas, impetigo, tinea, athlete's foot, and
ringworm).

Moreover, the protein product of this gene may also be useful for the
treatment or diagnosis of various connective tissue disorders (i.e., arthritis, trauma,
tendonitis, chondromalacia and inflammation, etc.), autoimmune disorders (i.e.,
rheumatoid arthritis, lupus, scleroderma, dermatomyositis, etc.), dwarfism, spinal
deformation, joint abnormalities, and chondrodysplasias (i.e. spondyloepiphyseal
dysplasia congenita, familial osteoarthritis, atelosteogenesis type II, metaphyseal
chondrodysplasia type Schmid). Moreover, the protein is useful in the detection,
treatment, and/or prevention of a variety of vascular disorders and conditions, which
include, but are not limited to, microvascular disease, vascular leak syndrome,
aneurysm, stroke, embolism, thrombosis, coronary artery disease, arteriosclerosis,
and/or atherosclerosis. Furthermore, the protein may also be used to determine
biological activity, to raise antibodies, as tissue markers, to isolate cognate ligands or
receptors, to identify agents that modulate their interactions, in addition to its use as a
nutritional supplement. Protein, as well as antibodies directed against the protein, may
show utility as a tumor marker and/or immunotherapy targets for the above listed
tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:89 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1328 of SEQ ID NO:89, b is an
integer of 15 to 1342, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:89, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 80 

This gene is expressed primarily in immune cells and to a lesser extent in a
wide variety of human tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, T cell or B cell leukemia and various immunodeficiencies. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels is routinely detected in
certain tissues or cell types (e.g., immune, hematopoietic, and cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative
to the standard gene expression level, i.e., the expression level in healthy tissue or
bodily fluid from an individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 203 as residues: Gly-3 to Gln-9. Polynucleotides
encoding said polypeptides are also provided.

The tissue distribution in immune cells indicates that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis and treatment of
immune system diseases such as immunodeficiencies and T cell and/or B cell
leukemia. Representative uses are described in the "Immune Activity" and "infectious
disease" sections below, in Examples 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere
herein. Briefly, the expression of this gene product indicates a role in regulating the
proliferation; survival; differentiation; and/or activation of hematopoietic cell
lineages, including blood stem cells. This gene product is involved in the regulation of
cytokine production, antigen presentation, or other processes suggesting a usefulness
in the treatment of cancer (e.g. by boosting immune responses).

Since the gene is expressed in cells of lymphoid origin, the natural gene
product is involved in immune functions. Therefore it is also useful as an agent for
immunological disorders including arthritis, asthma, immunodeficiency diseases such
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
disease, and scleroderma. Moreover, the protein may represent a secreted factor that
influences the differentiation or behavior of other blood cells, or that recruits
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in
the expansion of stem cells and committed progenitors of various blood lineages, and
in the differentiation and/or proliferation of various cell types. Furthermore, the
protein may also be used to determine biological activity, to raise antibodies, as tissue
markers, to isolate cognate ligands or receptors, to identify agents that modulate their
interactions, in addition to its use as a nutritional supplement. Protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:90 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 756 of SEQ ID NO:90, b is an
integer of 15 to 770, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:90, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 81

The translation product of this gene shares sequence homology with IgE
receptor. See for example, Isolation and Characterization of cDNAs coding for the
Beta Subunit of the High-affinity Receptor for Immunoglobulin E, Proc. Natl. Acad.
Sci. USA. (1988 Sep.) 85(17): 6483-6487. Based on the sequence similarity, the
translation product of this gene is expected to share at least some biological activities
with IgE receptor proteins. Such activities are known in the art, some of which are
described elsewhere herein. IgE and its receptors are believed to have evolved as a
mechanism to protect mammals against parasites. But other and intrinsically
innocuous antigens can subvert this system to provoke an allergic response. For
human populations in industrialized countries, allergy and asthma now represent a far
greater threat than parasitic infection, and the main impetus for current studies of the
IgE system is the hope of understanding and intervening in the aetiology of allergic
diseases. The high-affinity receptor for immunoglobulin (Ig) E (Fc epsilon RI) on
mast cells and basophils plays a key role in IgE-mediated allergies. Fc epsilon RI is
composed of one alpha, one beta, and two gamma chains, which are all required for
cell surface expression of Fc epsilon RI, but only the alpha chain is involved in the
binding to IgE. Fc epsilon RI-IgE interaction is highly species specific, and rodent Fc
epsilon RI does not bind human IgE. The new homolog can be used to develop
anti-allergic agents. FcR deliver signals when they are aggregated at the cell surface. The
aggregation of FcR having immunoreceptor tyrosine-based activation motifs (ITAMs)
activates sequentially src family tyrosine kinases and syk family tyrosine kinases that
connect transduced signals to common activation pathways shared with other
receptors. FcR with ITAMs elicit cell activation, endocytosis, and phagocytosis. The
nature of responses depends primarily on the cell type. The aggregation of FcR
without ITAM does not trigger cell activation. Most of these FcR internalize their
ligands, which can be endocytosed, phagocytosed, or transcytosed. The fate of
internalized receptor-ligand complexes depends on defined sequences in the
intracytoplasmic domain of the receptors. The coaggregation of different FcR results
in positive or negative cooperation. Some FcR without ITAM use FcR with ITAM as
signal transduction subunits. The coaggregation of antigen receptors or of FcR having
ITAMs with FcR having immunoreceptor tyrosine-based inhibition motifs (ITIMs)
negatively regulates cell activation. FcR therefore appear as the subunits of
multichain receptors whose constitution is not predetermined and which deliver
adaptive messages as a function of the environment.

The polypeptide of this gene has been determined to have four transmembrane
domains at about amino acid position 51 - 67, 89 - 105, 119 - 135, and 190 - 206 of
the amino acid sequence referenced in Table 1 for this gene. Based upon these
characteristics, it is believed that the protein product of this gene shares structural
features to type IIIa membrane proteins.
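
The coordinates quoted above are 1-based and inclusive, which is easy to get wrong when slicing a sequence programmatically. The following Python sketch shows one way to handle them. The helper names are hypothetical, the positions are the approximate ones quoted in the text, and the referenced Table 1 sequence is not reproduced here, so any sequence passed in is a placeholder.

```python
# Approximate 1-based, inclusive transmembrane coordinates quoted in the text.
TM_DOMAINS = [(51, 67), (89, 105), (119, 135), (190, 206)]


def segment_lengths(domains):
    """Length of each segment, counting both end positions."""
    return [end - start + 1 for start, end in domains]


def extract_segments(sequence: str, domains):
    """Slice each domain out of a protein sequence given 1-based coordinates."""
    return [sequence[start - 1:end] for start, end in domains]


def connecting_loops(domains):
    """Coordinates of the stretches lying between consecutive domains."""
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(domains, domains[1:])]


if __name__ == "__main__":
    print(segment_lengths(TM_DOMAINS))   # each predicted span covers 17 residues
    print(connecting_loops(TM_DOMAINS))  # regions between the membrane spans
```

Note that Python slices are 0-based and end-exclusive, hence the `start - 1` in `extract_segments`.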

In another embodiment, polypeptides comprising the amino acid sequence of
the open reading frame upstream of the predicted signal peptide are contemplated by
the present invention. Specifically, polypeptides of the invention comprise the
following amino acid sequence: ETRVKTSLELLRTQLEPTGTVGNTIMTSQPVPN
ETIIVLPSNVINFSQAEKPEPTNQGQDSLKKHLHAEIKVIGTIQILCGMMVLSL
GIILASASFSPNFTQVTSTLLNSAYPFIGPFFFIISGSLSIATEKRLTKLLVHSSLV
GSILSALSALVGHILSVKQATLNPASLQCELDKNNIPTRSYVSYFYHDSLYTT
DCYTAKASLAGXLSLMLICTLLEFCLAVLTAVLRWKQAYSDFPGSVLFLPH
SYIGNSGMSSKMTHDCGYEELLTS (SEQ ID NO: 353). Polynucleotides encoding
these polypeptides are also provided.

A preferred polypeptide fragment of the invention comprises the following
amino acid sequence: MMVLSLGIILASASFSPNFTQVTSTLLNSAYPFIGPFFFI
ISGSLSIATEKRLTKLLVHSSLVGSILSALSALVGHILSVKQATLNPASLQC
ELDKNNIPTRSYVSYFYHDSLYTTDCYTAKASLAGXLSLMLICTLLEFCL
AVLTAVLRWKQAYSDFPGSVLFLPHSYIGNSGMSSKMTHDCGYEELLTS
(SEQ ID NO: 354). Polynucleotides encoding these polypeptides are also provided.

The gene encoding the disclosed cDNA is believed to reside on chromosome
1. Accordingly, polynucleotides related to this invention are useful as a marker in
linkage analysis for chromosome 1.

This gene is expressed primarily in immune system tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, immune system diseases and/or disorders such as cancer. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels is routinely detected in
certain tissues or cell types (e.g., immune, hematopoietic, and cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative
to the standard gene expression level, i.e., the expression level in healthy tissue or
bodily fluid from an individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 204 as residues: Gln-23 to Lys-39, Glu-150 to
Thr-158. Polynucleotides encoding said polypeptides are also provided.

The tissue distribution in immune cells and tissues, combined with the
homology to IgE receptor, indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of immune system
disorders. Representative uses are described in the "Immune Activity" and "infectious
disease" sections below, in Example 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere
herein. Briefly, the expression of this gene product indicates a role in regulating the
proliferation; survival; differentiation; and/or activation of hematopoietic cell
lineages, including blood stem cells. This gene product is involved in the regulation of
cytokine production, antigen presentation, or other processes suggesting a usefulness
in the treatment of cancer (e.g., by boosting immune responses).

Since the gene is expressed in cells of lymphoid origin, the natural gene
product is involved in immune functions. Therefore it is also useful as an agent for
immunological disorders including arthritis, asthma, immunodeficiency diseases such
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that
influences the differentiation or behavior of other blood cells, or that recruits
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in
the expansion of stem cells and committed progenitors of various blood lineages, and
in the differentiation and/or proliferation of various cell types. The gene product may
also be involved in lymphopoiesis; therefore, it can be used in immune disorders such
as infection, inflammation, allergy, immunodeficiency, etc. In addition, this gene
product may have commercial utility in the expansion of stem cells and committed
progenitors of various blood lineages, and in the differentiation and/or proliferation of
various cell types. Furthermore, the protein may also be used to determine biological
activity, to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to
identify agents that modulate their interactions, in addition to its use as a nutritional
supplement. The protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy target for the above listed tissues.
Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:91 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1556 of SEQ ID NO:91, b is an
integer of 15 to 1570, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:91, and where b is greater than or equal to a + 14.
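
The general formula a-b above amounts to a constraint on pairs of nucleotide positions. As a minimal sketch only (the function names `is_excluded_range` and `count_excluded_ranges` are illustrative and are not part of the disclosure), the following Python checks whether a pair (a, b) satisfies the stated bounds for a sequence of 1570 nucleotides, and counts how many such pairs exist. It assumes, as the stated numbers suggest, that the upper bound on a is the sequence length minus 14.

```python
def is_excluded_range(a: int, b: int, seq_len: int = 1570) -> bool:
    """Return True if (a, b) satisfies the a-b formula for a sequence of seq_len residues.

    a runs from 1 to seq_len - 14, b runs from 15 to seq_len, and b >= a + 14,
    so every described fragment spans at least 15 nucleotide positions.
    """
    return 1 <= a <= seq_len - 14 and 15 <= b <= seq_len and b >= a + 14


def count_excluded_ranges(seq_len: int = 1570) -> int:
    """Count all (a, b) pairs allowed by the formula for a sequence of seq_len residues."""
    # For each a, b may take any value from a + 14 up to seq_len inclusive.
    return sum(seq_len - (a + 14) + 1 for a in range(1, seq_len - 13))


if __name__ == "__main__":
    print(is_excluded_range(1, 15))       # shortest fragment at the 5' end
    print(is_excluded_range(1557, 1570))  # a is out of range for this sequence
    print(count_excluded_ranges())
```

The same functions can be reused for the other sequences in this section by passing the corresponding upper bound for b as `seq_len`.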

FEATURES OF PROTEIN ENCODED BY GENE NO: 82

In another embodiment, polypeptides comprising the amino acid sequence of
the open reading frame upstream of the predicted signal peptide are contemplated by
the present invention. Specifically, polypeptides of the invention comprise the
following amino acid sequence: GASCEGGGAAARAALGVHRSQKALLVFRRTL
SNLLYMPLLRGLLWLQVLCAGPLHTEAVVLLVPSDDGRAFLLRSRLLHPEAH
VPPAADRGASLQCVLHQAAPKSRPRSPAAGAALLHXPRRTGDEPCREFHGN
GFPGPTQLTPGECGLPAPSSLLQHASAPVRTGSEGQVVGCPRARGETGEGLSL
AFLSSLMFTSRNGLVGC GASCEGGGAAARAALGVHRSQKALLVFRRTLSNL
LYMPLLRGLLWLQVLCAGPLHTEAVVLLVPSDDGRAFLLRSRLLHPEAHVPP
AADRGASLQCVLHQAAPKSRPRSPAAGAALLHXPRRTGDEPCREFHGNGFP
GPTQLTPGECGLPAPSSLLQHASAPVRTGSEGQVVGCPRARGETGEGLSLA
FLSSLMFTSRNGLVGC (SEQ ID NO: 355). Polynucleotides encoding these
polypeptides are also provided.

The gene encoding the disclosed cDNA is believed to reside on chromosome 
7. Accordingly, polynucleotides related to this invention are useful as a marker in 
linkage analysis for chromosome 7. 

This gene is expressed primarily in activated T cells, and to a lesser extent in a 
wide variety of human tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, immune and hematopoietic diseases and/or disorders, particularly
immunodeficiencies. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels is routinely detected in certain tissues or cell types
(e.g., immune, hematopoietic, and cancerous and wounded tissues) or bodily fluids
(e.g., serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell 
sample taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 205 as residues: Pro-67 to Ser-73. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in activated T cells indicates that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis and treatment of
immunodeficiencies. Representative uses are described in the "Immune Activity" and
"infectious disease" sections below, in Example 11, 13, 14, 16, 18, 19, 20, and 27, and
elsewhere herein. Briefly, the expression of this gene product indicates a role in
regulating the proliferation; survival; differentiation; and/or activation of
hematopoietic cell lineages, including blood stem cells. This gene product is involved
in the regulation of cytokine production, antigen presentation, or other processes
suggesting a usefulness in the treatment of cancer (e.g., by boosting immune
responses).

Since the gene is expressed in cells of lymphoid origin, the natural gene
product is involved in immune functions. Therefore it is also useful as an agent for
immunological disorders including arthritis, asthma, immunodeficiency diseases such
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that
influences the differentiation or behavior of other blood cells, or that recruits
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in
the expansion of stem cells and committed progenitors of various blood lineages, and
in the differentiation and/or proliferation of various cell types. Furthermore, the
protein may also be used to determine biological activity, to raise antibodies, as tissue
markers, to isolate cognate ligands or receptors, to identify agents that modulate their
interactions, in addition to its use as a nutritional supplement. The protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy target for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:92 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 2936 of SEQ ID NO:92, b is an
integer of 15 to 2950, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:92, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 83

The translation product of this gene was shown to have homology to the
human transmembrane protein (See Genbank Accession No. gb|AAC51364.1|
(AF000959); all references and information available through this accession are
hereby incorporated by reference herein; for example, Genomics 42 (2), 245-251
(1997)) which is thought to be implicated in velo-cardio-facial syndrome.

A preferred polypeptide fragment of the invention comprises the following 
amino acid sequence: MGSAALEILGLVLCLVGWGGLILACGLPMWQVTAFLD 
HNIVTAQTTWKGLWMSCVVQSTGTCSAKCTTRCWL (SEQ ID NO: 356).
Polynucleotides encoding these polypeptides are also provided. 

The gene encoding the disclosed cDNA is believed to reside on chromosome
22. Accordingly, polynucleotides related to this invention are useful as a marker in
linkage analysis for chromosome 22.

This gene is expressed primarily in dementia brain tissue, and to a lesser 
extent in a wide variety of human tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, neural diseases and/or disorders, particularly dementia. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the central nervous
system, expression of this gene at significantly higher or lower levels is routinely
detected in certain tissues or cell types (e.g., neural, and cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative
to the standard gene expression level, i.e., the expression level in healthy tissue or
bodily fluid from an individual not having the disorder.



wo 00/06698 



210 



PCT/US99/17130 



Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 206 as residues: Ser-201 to Tyr-217. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in dementia brain tissue, combined with the homology
to the transmembrane protein, indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of dementia, and
potentially for velo-cardio-facial syndrome. Representative uses are described in the
"Regeneration" and "Hyperproliferative Disorders" sections below, in Example 11, 
15, and 18, and elsewhere herein. Briefly, the uses include, but are not limited to the 
detection, treatment, and/or prevention of Alzheimer's Disease, Parkinson's Disease, 
Huntington's Disease, Tourette Syndrome, meningitis, encephalitis, demyelinating 
diseases, peripheral neuropathies, neoplasia, trauma, congenital malformations, spinal 
cord injuries, ischemia and infarction, aneurysms, hemorrhages, schizophrenia, 
mania, dementia, paranoia, obsessive compulsive disorder, depression, panic disorder, 
learning disabilities, ALS, psychoses, autism, and altered behaviors, including 
disorders in feeding, sleep patterns, balance, and perception. In addition, elevated 
expression of this gene product in regions of the brain indicates it plays a role in 
normal neural function. 

Potentially, this gene product is involved in synapse formation, 
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or 
survival. Furthermore, the protein may also be used to determine biological activity,
to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to
identify agents that modulate their interactions, in addition to its use as a nutritional
supplement. The protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy target for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:93 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1708 of SEQ ID NO:93, b is an
integer of 15 to 1722, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:93, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 84

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: LKRAPPGPALAKGLLQPSSTFQALETNIGDQVR 
RHSTAVVIREMTSYILiSFVLLIGVGCIEKDQSCPVFGGRKRLHLLFVGGQLRQ 
VRMLRGELSCACYRPHVQALQLGGCTCF (SEQ ID NO: 357). Polynucleotides 
encoding these polypeptides are also provided. 

This gene is expressed primarily in the adult pulmonary system. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, cystic fibrosis, bronchitis and any pulmonary disorders in general. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
pulmonary system, expression of this gene at significantly higher or lower levels is 
routinely detected in certain tissues or cell types (e.g., pulmonary, cardiovascular, and 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
pulmonary surfactant, pulmonary lavage/sputum, synovial fluid and spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative 
to the standard gene expression level, i.e., the expression level in healthy tissue or 
bodily fluid from an individual not having the disorder.

The tissue distribution of this gene only in the pulmonary system indicates that
it plays a key role in the functioning of the pulmonary system. This would suggest
that misregulation of the expression of this protein product in the adult could lead to
lymphoma or sarcoma formation, particularly in the lung, and that the protein product
could be used in the treatment and/or detection of these disease states. The gene
or gene product may also be useful in the treatment and/or detection of pulmonary
defects such as pulmonary edema and embolism, bronchitis and cystic fibrosis.
Furthermore, the protein may also be used to determine biological activity, to raise
antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents
that modulate their interactions, in addition to its use as a nutritional supplement.
The protein, as well as antibodies directed against the protein, may show utility as a tumor
marker and/or immunotherapy target for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:94 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 621 of SEQ ID NO:94, b is an
integer of 15 to 635, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:94, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 85 
The translation product of this gene was found to be homologous to CAM
proteins. Based on the sequence similarity, the translation product of this gene is
expected to share at least some biological activities with CAM proteins. Such
activities are known in the art, some of which are described elsewhere herein.

A preferred polypeptide variant of the invention comprises the following
amino acid sequence: MLCPWRTANLGLLLILTIFLVAEAEGAAQPNNSLM
LQTSKENHALASSSLCMDEKQITQNYSKVLAEVNTSWPVKMATNAVLC
CPPIALRNLIIITWEIILRGQPSCTKAYKKETNETKETNCTDERITWVSRPDQ
NSDLQIRTVAITHDGYYRCIMVTPDGNFHRGYHLQVLVTPEVTLFQNRNRTA
VCKAVAGKPAAHISWIPEGDCATKQEYWSNGTVTVKSTCHWEVHNVSTV
NCHVSHLTGNKSLYIELLPVPGAKKSSKLYIPYIILTIIILTIVGXIWLLKVNG
CXKYKLNKPESTPVVEEDEMQPYAFYTEKNNPLXXTTNKVKASEALQSEV
DTDLHTL (SEQ ID NO:208). Polynucleotides encoding these polypeptides are also
provided.

The polypeptide of this gene has been determined to have a transmembrane
domain at about amino acid position 271 - 287 of the amino acid sequence referenced
in Table 1 for this gene. Moreover, a cytoplasmic tail encompassing amino acids 288
to 348 of this protein has also been determined. Based upon these characteristics, it is
believed that the protein product of this gene shares structural features with type Ia
membrane proteins.
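
The domain coordinates given above can be laid out as a simple residue map. The sketch below is illustrative only: the `Region` type, the `REGIONS` list and the `region_of` helper are hypothetical names, the boundaries are the approximate positions stated in the text, and labelling residues 1 to 270 as extracellular is an assumption based on the stated resemblance to type Ia membrane proteins rather than something the text states directly.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    start: int  # 1-based residue position, inclusive
    end: int    # 1-based residue position, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Approximate boundaries taken from the description of Gene No: 85.
REGIONS = [
    Region("N-terminal portion (assumed extracellular)", 1, 270),
    Region("transmembrane domain", 271, 287),
    Region("cytoplasmic tail", 288, 348),
]


def region_of(position: int) -> str:
    """Return the name of the region that contains a 1-based residue position."""
    for region in REGIONS:
        if region.start <= position <= region.end:
            return region.name
    raise ValueError(f"position {position} is outside residues 1-348")


if __name__ == "__main__":
    for region in REGIONS:
        print(f"{region.name}: {region.start}-{region.end} ({region.length} residues)")
    print(region_of(290))
```

Under these assumptions, a residue such as Asn-290 maps to the cytoplasmic tail, and the transmembrane domain spans 17 residues.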

This gene is expressed primarily in dendritic cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, immunodeficiency, tumor necrosis, infection, lymphomas,
auto-immunities, cancer, metastasis, wound healing, inflammation, anemias
(leukemia) and other hematopoietic disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune system, expression of this gene 
at significantly higher or lower levels is routinely detected in certain tissues or cell 
types (e.g., immune, hematopoietic, and cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from 
an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 208 as residues: Asp-53 to Tyr-61, Pro-105 to Ile-128,
Arg-133 to Leu-140, Gln-182 to Ala-188, Pro-205 to Asn-218, Gly-259 to Ala-264,
Asn-290 to Ser-302, Glu-307 to Tyr-314, Tyr-317 to Lys-332. Polynucleotides
encoding said polypeptides are also provided. 

The tissue distribution in dendritic cells indicates that polynucleotides and
polypeptides corresponding to this gene are useful for the diagnosis and treatment of
immune disorders including: leukemias, lymphomas, auto-immunities,
immunodeficiencies (e.g., AIDS), immuno-suppressive conditions (transplantation) and
hematopoietic disorders. In addition, this gene product is applicable in conditions of
general microbial infection, inflammation or cancer. Representative uses are
described in the "Immune Activity" and "infectious disease" sections below, in
Example 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the
expression of this gene product indicates a role in regulating the proliferation;
survival; differentiation; and/or activation of hematopoietic cell lineages, including
blood stem cells. This gene product is involved in the regulation of cytokine
production, antigen presentation, or other processes suggesting a usefulness in the
treatment of cancer (e.g., by boosting immune responses).

Since the gene is expressed in cells of lymphoid origin, the natural gene
product is involved in immune functions. Therefore it is also useful as an agent for
immunological disorders including arthritis, asthma, immunodeficiency diseases such
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that
influences the differentiation or behavior of other blood cells, or that recruits
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in
the expansion of stem cells and committed progenitors of various blood lineages, and
in the differentiation and/or proliferation of various cell types. Furthermore, the
protein may also be used to determine biological activity, to raise antibodies, as tissue
markers, to isolate cognate ligands or receptors, to identify agents that modulate their
interactions, in addition to its use as a nutritional supplement. The protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy target for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:95 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 3784 of SEQ ID NO:95, b is an
integer of 15 to 3798, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:95, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 86 

In another embodiment, polypeptides comprising the amino acid sequence of
the open reading frame upstream of the predicted signal peptide are contemplated by
the present invention. Specifically, polypeptides of the invention comprise the
following amino acid sequence: VIKLICPAAFPVYFQDMARGCVCSLCASVCIFLS
SLFPLLPSVHSVNIISCLLLSKCFEGLELMCEHLYQLSQLHVLHHIFSYLLCTP
(SEQ ID NO: 358). Polynucleotides encoding these polypeptides are also provided.

This gene is expressed primarily in embryonic tissue and to a lesser extent in
a variety of other tissues and cell types.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, developmental anomalies, fetal deficiencies, cancer and neoplastic
states. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or
cell type(s). For a number of disorders of the above tissues or cells, particularly of the
developing fetus, expression of this gene at significantly higher or lower levels is
routinely detected in certain tissues or cell types (e.g., developmental, differentiating,
and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
amniotic fluid, synovial fluid and spinal fluid) or another tissue or cell sample taken
from an individual having such a disorder, relative to the standard gene expression
level, i.e., the expression level in healthy tissue or bodily fluid from an individual not
having the disorder.

The tissue distribution in embryonic tissue indicates that polynucleotides and
polypeptides corresponding to this gene are useful for the diagnosis and treatment of
developmental anomalies, fetal deficiencies and pre-natal disorders, as well as
abnormal cell proliferation and/or differentiation, neoplastic states and cancer.
Moreover, the expression within embryonic tissue and other cellular sources marked
by proliferating cells indicates this protein may play a role in the regulation of cellular
division, and may show utility in the diagnosis, treatment, and/or prevention of
developmental diseases and disorders, including cancer, and other proliferative
conditions. Representative uses are described in the "Hyperproliferative Disorders"
and "Regeneration" sections below and elsewhere herein. Briefly, developmental
tissues rely on decisions involving cell differentiation and/or apoptosis in pattern
formation.

Dysregulation of apoptosis can result in inappropriate suppression of cell
death, as occurs in the development of some cancers, or in failure to control the extent
of cell death, as is believed to occur in acquired immunodeficiency and certain
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of
potential roles in proliferation and differentiation, this gene product may have
applications in the adult for tissue regeneration and the treatment of cancers. It may
also act as a morphogen to control cell and tissue type specification. Therefore, the
polynucleotides and polypeptides of the present invention are useful in treating,
detecting, and/or preventing said disorders and conditions, in addition to other types
of degenerative conditions. Thus this protein may modulate apoptosis or tissue
differentiation and is useful in the detection, treatment, and/or prevention of
degenerative or proliferative conditions and diseases. The protein is useful in
modulating the immune response to aberrant polypeptides, as may exist in
proliferating and cancerous cells and tissues. The protein can also be used to gain new
insight into the regulation of cellular growth and proliferation. Furthermore, the
protein may also be used to determine biological activity, to raise antibodies, as tissue
markers, to isolate cognate ligands or receptors, to identify agents that modulate their
interactions, in addition to its use as a nutritional supplement. The protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy target for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:96 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 2669 of SEQ ID NO:96, b is an
integer of 15 to 2683, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:96, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 87 

The translation product of this gene shares sequence homology with
inter-alpha-trypsin inhibitor, which is thought to be important in inhibition of trypsin and
other serine proteases (See Genbank Accession No. pir|S30350|S30350; all references
and information available through this accession are hereby incorporated herein by
reference; for example, Eur. J. Biochem. 179 (1), 147-154 (1989), J. Biol. Chem. 264
(27), 15975-15981 (1989), and J. Biol. Chem. 266 (2), 747-751 (1991)).

Contact of cells with supernatant expressing the product of this gene has been
shown to increase the permeability of the plasma membrane of THP-1 cells to
calcium. Thus it is likely that the product of this gene is involved in a signal
transduction pathway that is initiated when the product binds a receptor on the surface
of the plasma membrane of monocytes, in addition to other cell lines or tissue
cell types. Thus, polynucleotides and polypeptides have uses which include, but are
not limited to, activating monocytes, and to a lesser extent, other immune and/or
hematopoietic cells. Binding of a ligand to a receptor is known to alter intracellular
levels of small molecules, such as calcium, potassium and sodium, as well as alter pH
and membrane potential. Alterations in small molecule concentration can be measured
to identify supernatants which bind to receptors of a particular cell.

In another embodiment, polypeptides comprising the amino acid sequence of
the open reading frame upstream of the predicted signal peptide are contemplated by
the present invention. Specifically, polypeptides of the invention comprise the
following amino acid sequence: YXIPGSTHASGRQRGSGRGEDDSGPPPSTVINQ
NETFANIIFKPTVVQQARIAQNGILGDFIIRYDVNREQSIGDIQVLNGYFVHYF 
APKDLPPLPKNVVFVLDSSASMVGTKLRQTKDALFTILHDLRPQDRFSIIGFS 
NRIKVWKDHLISVTPDSIRDGKVYIHHMSPTGGTDINGVLQRAIRLLNKYVAH 
SGIGDRSVSLIVFLTDGKPTVGETHTLKILNNTREAARGQVCIFTIGIGNDVD
FRLLEKLSLENCGLTRRVHEEEDAGSQLIGFYDEIRTPLLSDIRIDYPPSSVVQ 
ATKTLFPNYFNGSEIIIAGKLVDRKLDHLHVEVTASNSKKFIILKTDVPVRPQK 
AGKDVTGSPRPGGDGEGDXxVHIERLWSYLTTKELLSSWLQSDDEPEKERLRQ 
RAQALAVSYRFLTPFTSMKLRGPVPRMDGLEEAHGMSAAMGPEPVVQSVR
GAGTQPGPLLKKPYQPRIKJSKTSVDGDPHFVVDFPLSRLTVCFNIDGQPGDIL
RLVSDHRDSGVTVNGELIGAPAPPNGHKKQRTYLRTITILINfCPERSYLEITPS 
RVILDGGDRLVLPCNQSVVVGSWGLEVSVSANANVTVTIQGSIAFVILIHLYK 
KPAPFQRHHLGFYIANSEGLSSNCHGLLGQFLNQDARLTEDPAGPSQNLTHP 
LLLQVGEGPEAVLTVKGHQVPVVWKQRKIYN GEEQXDCWFARNMPPN
(SEQ ID NO: 359). Polynucleotides encoding these polypeptides are also provided.

This gene is expressed primarily in placenta and adipose tissue and to a lesser 
extent in several other organs and tissues including cancer. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, disorders of developing organs and metabolic diseases, in addition to
vascular diseases and conditions. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the developing systems and metabolic systems,
expression of this gene at significantly higher or lower levels is routinely detected in
certain tissues or cell types (e.g., reproductive, vascular, and cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative
to the standard gene expression level, i.e., the expression level in healthy tissue or
bodily fluid from an individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 210 as residues: Lys-5 to Lys-10, Asn-33 to Lys-39,
Asp-48 to Lys-54, Pro-62 to Asp-67, Asn-116 to Arg-123, His-157 to Ala-162,
Val-242 to Lys-249, Val-251 to Asp-264. Polynucleotides encoding said polypeptides are
also provided.

The tissue distribution in placenta, combined with the homology to
inter-alpha-trypsin inhibitor and the detected calcium flux biological activity,
indicates that polynucleotides and polypeptides corresponding to this gene are useful
for treatment and diagnosis of disorders of developing and metabolic systems. This
protein may play a role in the regulation of cellular division, and may show utility in
the diagnosis, treatment, and/or prevention of developmental diseases and disorders,
including cancer, and other proliferative conditions. Representative uses are described
in the "Hyperproliferative Disorders" and "Regeneration" sections below and elsewhere
herein. Briefly, developmental tissues rely on decisions involving cell differentiation
and/or apoptosis in pattern formation.

Dysregulation of apoptosis can result in inappropriate suppression of cell
death, as occurs in the development of some cancers, or in failure to control the extent
of cell death, as is believed to occur in acquired immunodeficiency and certain
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of
potential roles in proliferation and differentiation, this gene product may have
applications in the adult for tissue regeneration and the treatment of cancers. It may
also act as a morphogen to control cell and tissue type specification. Therefore, the
polynucleotides and polypeptides of the present invention are useful in treating,
detecting, and/or preventing said disorders and conditions, in addition to other types
of degenerative conditions. Thus this protein may modulate apoptosis or tissue
differentiation and is useful in the detection, treatment, and/or prevention of
degenerative or proliferative conditions and diseases. The protein is useful in
modulating the immune response to aberrant polypeptides, as may exist in
proliferating and cancerous cells and tissues. The protein can also be used to gain new
insight into the regulation of cellular growth and proliferation. Moreover, the protein
is useful in the detection, treatment, and/or prevention of a variety of vascular
disorders and conditions, which include, but are not limited to, microvascular disease,
vascular leak syndrome, aneurysm, stroke, embolism, thrombosis, coronary artery
disease, arteriosclerosis, and/or atherosclerosis. Polynucleotides and polypeptides of
the invention are also useful for the treatment, detection, and/or prevention of
inflammation, tumor invasion and metastasis, wound healing, liver disease,
disseminated intravascular coagulation, Alzheimer's Disease, ophthalmic disease,
apoptosis, tissue remodeling, intrauterine growth retardation, preeclampsia,
angiogenesis, cell migration, fetal development, trophoblast implantation, ovulation,
pemphigus and psoriasis, and antiviral therapy. Furthermore, the protein may also be
used to determine biological activity, to raise antibodies, as tissue markers, to isolate
cognate ligands or receptors, to identify agents that modulate their interactions, in
addition to its use as a nutritional supplement. The protein, as well as antibodies
directed against the protein, may show utility as a tumor marker and/or immunotherapy
targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:97 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 2167 of SEQ ID NO:97, b is an
integer of 15 to 2181, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:97, and where b is greater than or equal to a + 14.
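
By way of illustration only, the general formula a-b recited above can be sketched in code. The sketch is not part of the disclosure: the function names (is_excluded, excluded_ranges, count_excluded) and the min_len parameter are assumptions introduced here. It takes only the figures stated for SEQ ID NO:97 (a from 1 to 2167, b from 15 to 2181, b greater than or equal to a + 14) and reads them as describing every contiguous fragment of at least 15 nucleotides within a 2181-nucleotide sequence.

```python
def is_excluded(a, b, seq_len, min_len=15):
    """Return True if the fragment a-b (1-based, inclusive) fits the general formula.

    The formula requires 1 <= a <= seq_len - 14, 15 <= b <= seq_len,
    and b >= a + 14, i.e. a fragment of at least min_len residues.
    """
    return (
        1 <= a <= seq_len - min_len + 1
        and min_len <= b <= seq_len
        and b >= a + min_len - 1
    )


def excluded_ranges(seq_len, min_len=15):
    """Yield every (a, b) pair described by the formula for a given sequence length."""
    for a in range(1, seq_len - min_len + 2):
        for b in range(a + min_len - 1, seq_len + 1):
            yield a, b


def count_excluded(seq_len, min_len=15):
    """Count the (a, b) pairs without enumerating them.

    With n valid start positions, the first start allows n end positions,
    the second n - 1, and so on, giving n * (n + 1) / 2 pairs.
    """
    n = seq_len - min_len + 1
    return n * (n + 1) // 2


if __name__ == "__main__":
    SEQ_97_LEN = 2181  # length implied by "b is an integer of 15 to 2181"
    print(is_excluded(1, 15, SEQ_97_LEN))        # shortest fragment at the 5' end
    print(is_excluded(2167, 2181, SEQ_97_LEN))   # shortest fragment at the 3' end
    print(is_excluded(2168, 2181, SEQ_97_LEN))   # only 14 residues long
    print(count_excluded(SEQ_97_LEN))
```

The same check applies to the other sequences by substituting the upper bound of b recited for each SEQ ID NO; the closed-form count is used because enumerating every pair grows quadratically with sequence length.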

FEATURES OF PROTEIN ENCODED BY GENE NO: 88

The translation product of this gene was shown to have homology to the
human colon carcinoma antigen NY-CO-7 (See Genbank and Geneseq Accession
Nos. gb|AAC18038.1| (AF039689) and WO9904265; all references available through
this accession are hereby incorporated herein by reference; for example, Int. J. Cancer
76 (5), 652-658 (1998)).

This gene is expressed primarily in breast and breast cancer and to a lesser
extent in several other organs and tissues including cancers.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, disorders of reproductive organs and the gastrointestinal system,
including cancers. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the reproductive systems, expression of this gene at
significantly higher or lower levels is routinely detected in certain tissues or cell types
(e.g., gastrointestinal, reproductive, and cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, breast milk, chyme, bile, synovial fluid and spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy
tissue or bodily fluid from an individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 211 as residues: Gly-22 to Gly-28, Leu-71 to Phe-77,
Asn-101 to Val-108, Pro-122 to Ser-127, Arg-149 to Pro-154, Gly-191 to Phe-196,
Pro-199 to Thr-211. Polynucleotides encoding said polypeptides are also provided.

The tissue distribution in breast and breast cancer tissue, combined with the
homology to a colon cancer antigen, indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of disorders of the
reproductive systems and cancers. This protein may play a role in the regulation of
cellular division, and may show utility in the diagnosis, treatment, and/or prevention
of developmental diseases and disorders, including cancer, and other proliferative
conditions. Representative uses are described in the "Hyperproliferative Disorders"
and "Regeneration" sections below and elsewhere herein. Briefly, developmental
tissues rely on decisions involving cell differentiation and/or apoptosis in pattern
formation.

Dysregulation of apoptosis can result in inappropriate suppression of cell
death, as occurs in the development of some cancers, or in failure to control the extent
of cell death, as is believed to occur in acquired immunodeficiency and certain
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of
potential roles in proliferation and differentiation, this gene product may have
applications in the adult for tissue regeneration and the treatment of cancers. It may
also act as a morphogen to control cell and tissue type specification. Therefore, the
polynucleotides and polypeptides of the present invention are useful in treating,
detecting, and/or preventing said disorders and conditions, in addition to other types
of degenerative conditions. Thus this protein may modulate apoptosis or tissue
differentiation and is useful in the detection, treatment, and/or prevention of
degenerative or proliferative conditions and diseases. The protein is useful in
modulating the immune response to aberrant polypeptides, as may exist in
proliferating and cancerous cells and tissues. The protein can also be used to gain new
insight into the regulation of cellular growth and proliferation. Furthermore, the
protein may also be used to determine biological activity, to raise antibodies, as tissue
markers, to isolate cognate ligands or receptors, to identify agents that modulate their
interactions, in addition to its use as a nutritional supplement. The protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:98 and may have been publicly available prior to conception of
the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1943 of SEQ ID NO:98, b is an
integer of 15 to 1957, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:98, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 89 

The translation product of this gene shares sequence homology with the amino
acid and protein sequence of a Xenopus transmembrane protein of unknown function.
The very 5'-end of the contig is identical to the mRNA for the human LGN mosaic
protein. Based on the sequence similarity, the translation product of this gene is
expected to share at least some biological activities with LGN mosaic proteins. Such
activities are known in the art, some of which are described elsewhere herein.
Preferred polypeptides of the invention comprise the following amino acid sequence:
PRVRPPTKALAVTFTTFVTEPLKHIGKGTGEFIKALMKEIPALLHLPVLIIMAL 
AILSFCYGAGKSVHVLRHIGGPEREPPQALRPRDRRRQEEIDYRPDGGAGDAD
FHYRGQMGPTEQGPYAKTYEGRREILRERDVDLRFQTGNKSPEVLRAFDVPD 
AEAREHPTVVPSHKSPVLDTKPKETGGILGEdjPKESSTESSQSAKPVSGQDTS 
GNTEGSPAAEKAQLKSEAAGSPDQGSTYSPARGVAGPRGQDPVSSPCG (SEQ
ID NO:339). Polynucleotides encoding such polypeptides are also provided.

The gene encoding the disclosed cDNA is believed to reside on chromosome
1. Accordingly, polynucleotides related to this invention are useful as a marker in
linkage analysis for chromosome 1.

This gene is expressed primarily in small intestine and adipocytes and to a
lesser extent in various other normal and transformed cell types, mostly of endocrine
origin.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, conditions of growth and metabolism. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the digestive and endocrine
systems, expression of this gene at significantly higher or lower levels is routinely
detected in certain tissues or cell types (e.g., metabolic, gastrointestinal, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, bile,
chyme, synovial fluid and spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 212 as residues: Pro-40 to Gly-68, Gly-79 to Arg-93,
Phe-106 to Glu-114, Pro-122 to His-129, Thr-143 to Gly-149, Gly-155 to Ala-168,
Val-171 to Gly-182, Ala-195 to Pro-207, Pro-214 to Val-220. Polynucleotides
encoding said polypeptides are also provided.

The tissue distribution in small intestine indicates that polynucleotides and
polypeptides corresponding to this gene are useful for study and treatment of
disorders of growth and metabolism as well as endocrine abnormalities. Furthermore,
the protein may also be used to determine biological activity, to raise antibodies, as
tissue markers, to isolate cognate ligands or receptors, to identify agents that modulate
their interactions, in addition to its use as a nutritional supplement. The protein, as
well as antibodies directed against the protein, may show utility as a tumor marker
and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO:99 and may have been publicly available prior to conception of 
the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1098 of SEQ ID NO:99, b is an 
integer of 15 to 1112, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO:99, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 90 

The translation product of this gene shares sequence homology with IgE
receptor beta chain, which is thought to be important in immune function.

This gene is expressed primarily in kidney medulla tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, immune and renal diseases and/or disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the immune and renal systems,
expression of this gene at significantly higher or lower levels is routinely detected in
certain tissues or cell types (e.g., immune, renal, urogenital, and cancerous and
wounded tissues) or bodily fluids (e.g., lymph, serum, plasma, urine, synovial fluid
and spinal fluid) or another tissue or cell sample taken from an individual having such
a disorder, relative to the standard gene expression level, i.e., the expression level in
healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution in kidney renal medulla tissue, combined with the
homology to the IgE receptor beta chain, indicates that polynucleotides and
polypeptides corresponding to this gene are useful for the treatment of immune and
renal disorders. The protein product of this gene could be used in the treatment and/or
detection of kidney diseases including renal failure, nephritis, renal tubular acidosis,
proteinuria, pyuria, edema, pyelonephritis, hydronephritis, nephrotic syndrome, crush
syndrome, glomerulonephritis, hematuria, renal colic and kidney stones, in addition to
Wilms Tumor Disease, and congenital kidney abnormalities such as horseshoe
kidney, polycystic kidney, and Fanconi's syndrome. Alternatively, this gene product is
involved in the regulation of cytokine production, antigen presentation, or other
processes suggesting a usefulness in the treatment of cancer (e.g., by boosting immune
responses).

Since the gene is expressed in cells of lymphoid origin, the natural gene
product is involved in immune functions. Therefore it is also useful as an agent for
immunological disorders including arthritis, asthma, immunodeficiency diseases such
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities,
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that
influences the differentiation or behavior of other blood cells, or that recruits
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in
the expansion of stem cells and committed progenitors of various blood lineages, and
in the differentiation and/or proliferation of various cell types. Furthermore, the
protein may also be used to determine biological activity, to raise antibodies, as tissue
markers, to isolate cognate ligands or receptors, to identify agents that modulate their
interactions, in addition to its use as a nutritional supplement. The protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:100 and may have been publicly available prior to conception
of the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 873 of SEQ ID NO:100, b is an
integer of 15 to 887, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:100, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 91 

The translation product of this gene shares sequence homology with Diff 40
gene product (See Genbank Accession No. gb|AAC51134.1|; all references and
information available through this reference are hereby incorporated herein).

Preferred polypeptides of the invention comprise the following amino acid 
sequence: PRVRSIKVTELKGLANHVVVGSVSCETKDLFAALPQVVAVDIN 
DLGTIKLSLEVTWSPFDKDDQPSAASSVNKASTVTKRFSTYSQSPPDTPS 
LREQAFYNiMLRRQEELENGTAWSLSSESSDDSSSPQLSGTARHSPAPRPLV
QQPEPLPIQVAFRRPETPSSGPLDEEGAVAPVLANGHAPYSRTLSHISEASVNA
ALAEASVEAVGPKSLSWGPSPPTHPAPTHGKHPSPVPPALDPGHSATSST 
LGTTGSVPTSTDPAPSAHLDSVHKSTDSGPSELPGPTHTTTGSTYSAITTTHS 
APSPLTHTTTGSTHKPIISTLTTTGPTLNIIGPVQTTTSPTHTMPSPSSHSNSPQ 
YVDFCSSVCDNIFVHYVIGIFFHTLYSSKTL (SEQ ID NO:360), and/or PRVRS
IKVTELKGLANHVVVGSVSCETKDLFAALPQVVAVDINDLGTIKLSLEVTWSP
FDKDDQPSAASSVNKASTVTKRFSTYSQSPPDTPSLREQAFYNMLRRQEELE 
NGTAWSLSSESSDDSSSFQLSGTARHSPAP RPLVQQPEPLPIQVAFRRPET
PSSGPLDEEGAVAPVLANGHAPYSRTLSHISEASVNAALAEASVEAVGPKSL
SWGPSPPTHPAPTHGKHPSPVPPALDPGHSATSSTLGTTGSVPTSTD (SEQ ID
NO: 361). Polynucleotides encoding these polypeptides are also provided.
Polypeptides of the invention do not consist of the primary amino acid sequence
shown as Geneseq Accession No. W69430, which is hereby incorporated herein by
reference.

This gene is expressed primarily in liver and to a lesser extent in gall bladder
tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, metabolic and endocrine diseases and/or disorders, particularly hepatic
and gall bladder disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the metabolic and endocrine systems, expression of this
gene at significantly higher or lower levels is routinely detected in certain tissues or
cell types (e.g., hepatic, metabolic, gall bladder, gastrointestinal, and cancerous and
wounded tissues) or bodily fluids (e.g., lymph, serum, plasma, urine, bile, synovial
fluid and spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression
level in healthy tissue or bodily fluid from an individual not having the disorder.

Preferred polypeptides of the present invention comprise immunogenic
epitopes shown in SEQ ID NO: 214 as residues: Val-9 to Cys-14, Pro-42 to Thr-47,
Thr-56 to Ala-64, Asp-88 to His-98, Cys-128 to Ser-136, Arg-153 to Trp-161.
Polynucleotides encoding said polypeptides are also provided.

The tissue distribution in liver and gall bladder, combined with the homology
to the Diff 40 gene product, indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the study and treatment of endocrine and
metabolic disorders. Polynucleotides and polypeptides corresponding to this gene are
also useful for the detection and treatment of liver disorders and cancers.
Representative uses are described in the "Hyperproliferative Disorders", "Infectious
Disease", and "Binding Activity" sections below, in Examples 11 and 27, and elsewhere
herein. Briefly, the protein can be used for the detection, treatment, and/or prevention
of hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and conditions that are
attributable to the differentiation of hepatocyte progenitor cells. Furthermore, the
protein may also be used to determine biological activity, to raise antibodies, as tissue
markers, to isolate cognate ligands or receptors, to identify agents that modulate their
interactions, in addition to its use as a nutritional supplement. The protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:101 and may have been publicly available prior to conception
of the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1234 of SEQ ID NO:101, b is an
integer of 15 to 1248, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:101, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 92 

The polypeptide of this gene has been determined to have a transmembrane 
domain at about amino acid position 3 - 19 of the amino acid sequence referenced in 
Table 1 for this gene. Based upon these characteristics, it is believed that the protein
product of this gene shares structural features with type II membrane proteins.

This gene is expressed primarily in fetal brain and to a lesser extent in 
pancreas tumor, melanocyte and infant brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, neural diseases and/or disorders, particularly neurodevelopmental
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the central nervous system, expression of this gene at significantly
higher or lower levels is routinely detected in certain tissues or cell types (e.g., neural,
developmental, and cancerous and wounded tissues) or bodily fluids (e.g., lymph,
serum, plasma, amniotic fluid, urine, synovial fluid and spinal fluid) or another tissue
or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution in fetal brain tissue indicates that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis and treatment of
developmental disorders of the central nervous system. Representative uses are
described in the "Regeneration" and "Hyperproliferative Disorders" sections below, in
Example 11, 15, and 18, and elsewhere herein. Briefly, the uses include, but are not
limited to, the detection, treatment, and/or prevention of Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, Tourette Syndrome, meningitis,
encephalitis, demyelinating diseases, peripheral neuropathies, neoplasia, trauma,
congenital malformations, spinal cord injuries, ischemia and infarction, aneurysms,
hemorrhages, schizophrenia, mania, dementia, paranoia, obsessive compulsive
disorder, depression, panic disorder, learning disabilities, ALS, psychoses, autism,
and altered behaviors, including disorders in feeding, sleep patterns, balance, and
perception. In addition, elevated expression of this gene product in regions of the
brain indicates it plays a role in normal neural function.

Potentially, this gene product is involved in synapse formation, 
neurotransmission, learning, cognition, homeostasis, or neuronal differentiation or 
survival. Furthermore, the protein may also be used to determine biological activity, 
to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to 
identify agents that modulate their interactions, in addition to its use as a nutritional 
supplement. The protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and/or immunotherapy targets for the above listed tissues.

Many polynucleotide sequences, such as EST sequences, are publicly
available and accessible through sequence databases. Some of these sequences are
related to SEQ ID NO:102 and may have been publicly available prior to conception
of the present invention. Preferably, such related polynucleotides are specifically
excluded from the scope of the present invention. To list every related sequence is
cumbersome. Accordingly, preferably excluded from the present invention are one or
more polynucleotides comprising a nucleotide sequence described by the general
formula of a-b, where a is any integer between 1 to 1827 of SEQ ID NO:102, b is an
integer of 15 to 1841, where both a and b correspond to the positions of nucleotide
residues shown in SEQ ID NO:102, and where b is greater than or equal to a + 14.

FEATURES OF PROTEIN ENCODED BY GENE NO: 93

The translation product of this gene shares sequence homology with a
probable membrane protein YGL054c - yeast (Saccharomyces cerevisiae). Moreover,
the translation product of this gene also has homology to the human and
mouse cornichon protein, which is known to be necessary for both anterior-posterior
and dorsal-ventral pattern formation in conjunction with the EGF receptor signaling
process (See Genbank Accession Nos. gb|AAC98388.1| (AF104398), and sp|P52159;
all references and information available through these accessions are hereby
incorporated herein by reference; for example, Cell 81 (6), 967-978 (1995)).

The polypeptide of this gene has been determined to have two transmembrane
domains at about amino acid position 57 - 73, and 121 - 137 of the amino acid
sequence referenced in Table 1 for this gene. Moreover, a cytoplasmic tail
encompassing amino acids 1 - 14 of this protein has also been determined. Based
upon these characteristics, it is believed that the protein product of this gene shares
structural features with type IIIa membrane proteins.

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: YGCEKTTEGGRRRRRRMEAVVFVFSLLDCCAL 
IFLSVYniTLSDLECDYINARSCCSKLNKWVIPELIGHTIVTVLLLMSLHWF 
IFLLNLPVATWNIYRYIMVPSGNMGVFDPTEIHNRGQLKSHMKEAMIKLGFH 
LLC FFMYLYSMILALIND (SEQ ID NO:362). Polynucleotides encoding these 
polypeptides are also provided. 

The gene encoding the disclosed cDNA is believed to reside on chromosome 
1. Accordingly, polynucleotides related to this invention are useful as a marker in 
linkage analysis for chromosome 1. 

This gene is expressed primarily in activated T-cells and to a lesser extent in 
endometrial tumor, T cell helper II cells, microvascular endothelial cells, Raji cells 
treated with cycloheximide and umbilical vein endothelial cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, immune, hematopoietic, and vascular diseases and/or disorders. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
immune system, expression of this gene at significantly higher or lower levels is 
routinely detected in certain tissues or cell types (e.g., immune, hematopoietic, 
vascular, and cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum, 
plasma, amniotic fluid, urine, synovial fluid and spinal fluid) or another tissue or cell 
sample taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 216 as residues: Ser-39 to Asn-45, Asn-103 to 
Ser-109. Polynucleotides encoding said polypeptides are also provided. 
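
Residue ranges written in this form can be read mechanically. The following is a minimal sketch, assuming each range is written as a three-letter residue code, a hyphen, and a 1-based position (for example "Ser-39 to Asn-45"); the helper name `parse_epitope_ranges` is hypothetical, and no sequence data are assumed.

```python
import re

# Matches e.g. "Ser-39 to Asn-45": three-letter code, hyphen, 1-based position.
_RANGE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text: str) -> list[tuple[str, int, str, int]]:
    """Extract (first_residue, start, last_residue, end) tuples from text."""
    ranges = []
    for first, start, last, end in _RANGE.findall(text):
        start_i, end_i = int(start), int(end)
        if end_i < start_i:
            raise ValueError(f"range ends before it starts: {first}-{start} to {last}-{end}")
        ranges.append((first, start_i, last, end_i))
    return ranges


# The two ranges listed for SEQ ID NO: 216.
for first, start, last, end in parse_epitope_ranges("Ser-39 to Asn-45, Asn-103 to Ser-109"):
    print(f"{first}-{start} to {last}-{end}: {end - start + 1} residues")
```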

The tissue distribution in activated T-cells indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for diagnosis and treatment of 
immune disorders involving activated T-cells. Representative uses are described in 
the "Immune Activity" and "infectious disease" sections below, in Examples 11, 13, 
14, 16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the expression of this gene 
product indicates a role in regulating the proliferation, survival, differentiation, and/or 
activation of hematopoietic cell lineages, including blood stem cells. This gene 
product is involved in the regulation of cytokine production, antigen presentation, or 
other processes suggesting a usefulness in the treatment of cancer (e.g. by boosting 
immune responses). 

Since the gene is expressed in cells of lymphoid origin, the natural gene 
product is involved in immune functions. Therefore it is also useful as an agent for 
immunological disorders including arthritis, asthma, immunodeficiency diseases such 
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory 
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities, 
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and 
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity 
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic 
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's 
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that 
influences the differentiation or behavior of other blood cells, or that recruits 
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in 
the expansion of stem cells and committed progenitors of various blood lineages, and 
in the differentiation and/or proliferation of various cell types. Moreover, the protein 
is useful in the detection, treatment, and/or prevention of a variety of vascular 
disorders and conditions, which include, but are not limited to, microvascular disease, 
vascular leak syndrome, aneurysm, stroke, embolism, thrombosis, coronary artery 
disease, arteriosclerosis, and/or atherosclerosis. Furthermore, the protein may also be 
used to determine biological activity, to raise antibodies, as tissue markers, to isolate 
cognate ligands or receptors, to identify agents that modulate their interactions, in 
addition to its use as a nutritional supplement. The protein, as well as antibodies directed 
against the protein, may show utility as a tumor marker and/or immunotherapy target 
for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO: 103 and may have been publicly available prior to conception 
of the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 671 of SEQ ID NO: 103, b is an 
integer of 15 to 685, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO: 103, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 94 

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: ARAPAPSLPPLPSPAPALAPAHSLLGLLLGRMS 
GSSLPSALALSLLLVSGSLLPGPGAAQNVRVQSGQDQ (SEQ ID NO: 363). 
Polynucleotides encoding these polypeptides are also provided. 

This gene is expressed primarily in dendritic cells and to a lesser extent in 
healing abdomen wound, and pancreas islet cell tumor cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, immune and hematopoietic diseases and/or disorders, particularly 
wound healing disorders. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the immune system, expression of this gene at 
significantly higher or lower levels is routinely detected in certain tissues or cell types 
(e.g., immune, hematopoietic, and cancerous and wounded tissues) or bodily fluids 
(e.g., lymph, serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from 
an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 217 as residues: Gln-34 to Lys-40. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in dendritic cells and early healing wound indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for treating 
wounds to enhance the healing process. Representative uses are described in the 
"Immune Activity" and "infectious disease" sections below, in Examples 11, 13, 14, 
16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the expression of this gene 
product indicates a role in regulating the proliferation, survival, differentiation, and/or 
activation of hematopoietic cell lineages, including blood stem cells. This gene 
product is involved in the regulation of cytokine production, antigen presentation, or 
other processes suggesting a usefulness in the treatment of cancer (e.g. by boosting 
immune responses). 

Since the gene is expressed in cells of lymphoid origin, the natural gene 
product is involved in immune functions. Therefore it is also useful as an agent for 
immunological disorders including arthritis, asthma, immunodeficiency diseases such 
as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory 
bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities, 
such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and 
tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity 
disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic 
lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's 
Disease, and scleroderma. Moreover, the protein may represent a secreted factor that 
influences the differentiation or behavior of other blood cells, or that recruits 
hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in 
the expansion of stem cells and committed progenitors of various blood lineages, and 
in the differentiation and/or proliferation of various cell types. Furthermore, the 
protein may also be used to determine biological activity, to raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. The protein, as well as 
antibodies directed against the protein, may show utility as a tumor marker and/or 
immunotherapy target for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO: 104 and may have been publicly available prior to conception 
of the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1154 of SEQ ID NO: 104, b is an 
integer of 15 to 1168, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO: 104, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 95 

Contact of cells with supernatant expressing the product of this gene has been 
shown to increase the permeability of the plasma membrane of aortic smooth muscle 
cells to calcium. Thus it is likely that the product of this gene is involved in a signal 
transduction pathway that is initiated when the product binds a receptor on the surface 
of the plasma membrane of smooth muscle cells, as well as of other cell lines or tissue 
cell types. Thus, polynucleotides and polypeptides have uses which include, but are 
not limited to, activating smooth muscle cells. Binding of a ligand to a receptor is 
known to alter intracellular levels of small molecules, such as calcium, potassium and 
sodium, as well as alter pH and membrane potential. Alterations in small molecule 
concentration can be measured to identify supernatants which bind to receptors of a 
particular cell. 

This gene is expressed primarily in pancreatic carcinoma, gall bladder and 
primary dendritic cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, metabolic and immune diseases and/or disorders, particularly cancers, 
such as pancreatic carcinoma and gall bladder tumor. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the immune system, expression 
of this gene at significantly higher or lower levels is routinely detected in certain 
tissues or cell types (e.g., metabolic, immune, hematopoietic, and cancerous and 
wounded tissues) or bodily fluids (e.g., lymph, serum, plasma, urine, synovial fluid 
and spinal fluid) or another tissue or cell sample taken from an individual having such 
a disorder, relative to the standard gene expression level, i.e., the expression level in 
healthy tissue or bodily fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 218 as residues: Lys-34 to Ile-41. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in pancreatic carcinoma and gall bladder indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for diagnosing 
and treating cancer, such as pancreatic carcinoma and gall bladder tumors. 
Representative uses are described here and elsewhere herein. Alternatively, the 
detected calcium flux biological activity indicates the protein is useful in the 
detection, treatment, and/or prevention of a variety of vascular disorders and 
conditions, which include, but are not limited to, microvascular disease, vascular leak 
syndrome, aneurysm, stroke, embolism, thrombosis, coronary artery disease, 
arteriosclerosis, and/or atherosclerosis. Furthermore, the protein may also be used to 
determine biological activity, to raise antibodies, as tissue markers, to isolate cognate 
ligands or receptors, to identify agents that modulate their interactions, in addition to 
its use as a nutritional supplement. The protein, as well as antibodies directed against the 
protein, may show utility as a tumor marker and/or immunotherapy target for the 
above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO: 105 and may have been publicly available prior to conception 
of the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1161 of SEQ ID NO: 105, b is an 
integer of 15 to 1175, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO: 105, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 96 

The polypeptide of this gene has been determined to have a transmembrane 
domain at about amino acid position 10 - 26 of the amino acid sequence referenced in 
Table 1 for this gene. Moreover, a cytoplasmic tail encompassing amino acids 27 to 
48 of this protein has also been determined. Based upon these characteristics, it is 
believed that the protein product of this gene shares structural features with type Ib 
membrane proteins. 

This gene is expressed primarily in osteosarcoma, Wilms' tumor, ovarian 
cancer and in T-cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, inflammatory diseases and cancers, such as osteosarcoma, Wilms' 
tumor and ovarian cancer. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the immune system, expression of this gene at 
significantly higher or lower levels is routinely detected in certain tissues or cell types 
(e.g., skeletal, renal, reproductive, immune, and cancerous and wounded tissues) or 
bodily fluids (e.g., lymph, serum, plasma, urine, synovial fluid and spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative 
to the standard gene expression level, i.e., the expression level in healthy tissue or 
bodily fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 219 as residues: Ser-30 to Pro-35. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of inflammatory 
conditions and cancer, such as osteosarcoma, Wilms' tumor and ovarian cancer. 
Moreover, the expression within cellular sources marked by proliferating cells 
indicates this protein may play a role in the regulation of cellular division, and may 
show utility in the diagnosis, treatment, and/or prevention of developmental diseases 
and disorders, including cancer, and other proliferative conditions. Representative 
uses are described in the "Hyperproliferative Disorders" and "Regeneration" sections 
below and elsewhere herein. Briefly, developmental tissues rely on decisions 
involving cell differentiation and/or apoptosis in pattern formation. 

Dysregulation of apoptosis can result in inappropriate suppression of cell 
death, as occurs in the development of some cancers, or in failure to control the extent 
of cell death, as is believed to occur in acquired immunodeficiency and certain 
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of 
potential roles in proliferation and differentiation, this gene product may have 
applications in the adult for tissue regeneration and the treatment of cancers. It may 
also act as a morphogen to control cell and tissue type specification. Therefore, the 
polynucleotides and polypeptides of the present invention are useful in treating, 
detecting, and/or preventing said disorders and conditions, in addition to other types 
of degenerative conditions. Thus this protein may modulate apoptosis or tissue 
differentiation and is useful in the detection, treatment, and/or prevention of 
degenerative or proliferative conditions and diseases. The protein is useful in 
modulating the immune response to aberrant polypeptides, as may exist in 
proliferating and cancerous cells and tissues. The protein can also be used to gain new 
insight into the regulation of cellular growth and proliferation. Furthermore, the 
protein may also be used to determine biological activity, to raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. The protein, as well as 
antibodies directed against the protein, may show utility as a tumor marker and/or 
immunotherapy target for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO: 106 and may have been publicly available prior to conception 
of the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1007 of SEQ ID NO: 106, b is an 
integer of 15 to 1021, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO: 106, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 97 

In another embodiment, polypeptides comprising the amino acid sequence of 
the open reading frame upstream of the predicted signal peptide are contemplated by 
the present invention. Specifically, polypeptides of the invention comprise the 
following amino acid sequence: GTSKDCVLYAFLDPOMAVPLFLYIFTLLPLLPF 
LLSLCFSPLTVKRSSSSESKSSL (SEQ ID NO: 364). Polynucleotides encoding 
these polypeptides are also provided. 

This gene is expressed primarily in ovarian cancer. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, ovarian cancer. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the immune system, expression of this gene at 
significantly higher or lower levels is routinely detected in certain tissues or cell types 
(e.g., reproductive, ovarian, and cancerous and wounded tissues) or bodily fluids (e.g., 
serum, plasma, urine, synovial fluid, amniotic fluid, or spinal fluid) or another tissue 
or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 220 as residues: Thr-28 to Ser-40. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution in ovarian tissues indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for treating and diagnosing cancer, 
e.g., ovarian cancer. Moreover, the expression within cellular sources marked by 
proliferating cells indicates this protein may play a role in the regulation of cellular 
division, and may show utility in the diagnosis, treatment, and/or prevention of 
developmental diseases and disorders, including cancer, and other proliferative 
conditions. Representative uses are described in the "Hyperproliferative Disorders" 
and "Regeneration" sections below and elsewhere herein. Briefly, developmental 
tissues rely on decisions involving cell differentiation and/or apoptosis in pattern 
formation. 

Dysregulation of apoptosis can result in inappropriate suppression of cell 
death, as occurs in the development of some cancers, or in failure to control the extent 
of cell death, as is believed to occur in acquired immunodeficiency and certain 
neurodegenerative disorders, such as spinal muscular atrophy (SMA). Because of 
potential roles in proliferation and differentiation, this gene product may have 
applications in the adult for tissue regeneration and the treatment of cancers. It may 
also act as a morphogen to control cell and tissue type specification. Therefore, the 
polynucleotides and polypeptides of the present invention are useful in treating, 
detecting, and/or preventing said disorders and conditions, in addition to other types 
of degenerative conditions. Thus this protein may modulate apoptosis or tissue 
differentiation and is useful in the detection, treatment, and/or prevention of 
degenerative or proliferative conditions and diseases. The protein is useful in 
modulating the immune response to aberrant polypeptides, as may exist in 
proliferating and cancerous cells and tissues. The protein can also be used to gain new 
insight into the regulation of cellular growth and proliferation. Furthermore, the 
protein may also be used to determine biological activity, to raise antibodies, as tissue 
markers, to isolate cognate ligands or receptors, to identify agents that modulate their 
interactions, in addition to its use as a nutritional supplement. The protein, as well as 
antibodies directed against the protein, may show utility as a tumor marker and/or 
immunotherapy target for the above listed tissues. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO: 107 and may have been publicly available prior to conception 
of the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 816 of SEQ ID NO: 107, b is an 
integer of 15 to 830, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO: 107, and where b is greater than or equal to a + 14. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 98 

This gene is expressed primarily in macrophages and breast cancer tissue and 
to a lesser extent in osteoblasts and smooth muscle. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, immune system dysfunction; inflammation; breast cancer; cancer; 
osteoporosis; osteopetrosis; and peristaltic disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the immune and skeletal 
systems, expression of this gene at significantly higher or lower levels is routinely 
detected in certain tissues or cell types (e.g., cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

Preferred polypeptides of the present invention comprise immunogenic 
epitopes shown in SEQ ID NO: 221 as residues: Glu-16 to Ala-40. Polynucleotides 
encoding said polypeptides are also provided. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and/or treatment of a variety of 
disorders. Expression in macrophages and other hematopoietic cell types indicates 
that this gene product is involved in the regulation of hematopoietic cell survival, 
proliferation, differentiation, or activation. It is involved in the control of such 
processes as immune surveillance, antigen presentation, T cell activation, cytokine 
release, and inflammation. Expression in breast cancer tissue may possibly correlate 
with the diagnosis and differentiation of cancerous tissue from normal breast tissue. 
Expression in osteoblasts and osteoclasts may implicate this gene product in the 
process of bone turnover, and target it as a likely candidate for the treatment of 
osteoporosis and/or osteopetrosis. Finally, expression in smooth muscle may indicate 
an involvement in the normal function of numerous internal organs and in the 
function of the digestive system. 

Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases. Some of these sequences are 
related to SEQ ID NO: 108 and may have been publicly available prior to conception 
of the present invention. Preferably, such related polynucleotides are specifically 
excluded from the scope of the present invention. To list every related sequence is 
cumbersome. Accordingly, preferably excluded from the present invention are one or 
more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a is any integer between 1 to 1287 of SEQ ID NO: 108, b is an 
integer of 15 to 1301, where both a and b correspond to the positions of nucleotide 
residues shown in SEQ ID NO: 108, and where b is greater than or equal to a + 14. 

Table 1 summarizes the information corresponding to each "Gene No." described 
above. The nucleotide sequence identified as "NT SEQ ID NO:X" was assembled 
from partially homologous ("overlapping") sequences obtained from the "cDNA 
clone ID" identified in Table 1 and, in some cases, from additional related DNA 
clones. The overlapping sequences were assembled into a single contiguous sequence 
of high redundancy (usually three to five overlapping sequences at each nucleotide 
position), resulting in a final sequence identified as SEQ ID NO:X. 

The cDNA Clone ID was deposited on the date and given the corresponding 
deposit number listed in "ATCC Deposit No:Z and Date." Some of the deposits 
contain multiple different clones corresponding to the same gene. "Vector" refers to 
the type of vector contained in the cDNA Clone ID. 

"Total NT Seq." refers to the total number of nucleotides in the contig 
identified by "Gene No." The deposited clone may contain all or most of these 
sequences, reflected by the nucleotide position indicated as "5' NT of Clone Seq." 
and the "3' NT of Clone Seq." of SEQ ID NO:X. The nucleotide position of SEQ ID 
NO:X of the putative start codon (methionine) is identified as "5' NT of Start Codon." 
Similarly, the nucleotide position of SEQ ID NO:X of the predicted signal sequence 
is identified as "5' NT of First AA of Signal Pep." 

The translated amino acid sequence, beginning with the methionine, is 
identified as "AA SEQ ID NO:Y," although other reading frames can also be easily 
translated using known molecular biology techniques. The polypeptides produced by 
these alternative open reading frames are specifically contemplated by the present 
invention. 

The first and last amino acid position of SEQ ID NO:Y of the predicted signal
peptide is identified as "First AA of Sig Pep" and "Last AA of Sig Pep." The
predicted first amino acid position of SEQ ID NO:Y of the secreted portion is
identified as "Predicted First AA of Secreted Portion." Finally, the amino acid
position of SEQ ID NO:Y of the last amino acid in the open reading frame is
identified as "Last AA of ORF."

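
The relationship among the amino acid position columns of Table 1 can be
illustrated with a short sketch. The Python below is illustrative only: the class
name Table1Row, its field names, the example sequence, and the position values are
hypothetical and are not taken from Table 1. Positions are assumed to be 1-based
and inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table1Row:
    """Hypothetical container for the amino acid position columns of Table 1."""
    first_aa_of_sig_pep: int
    last_aa_of_sig_pep: int
    predicted_first_aa_of_secreted_portion: int
    last_aa_of_orf: int

    def split(self, aa_seq: str) -> tuple[str, str]:
        """Return (predicted signal peptide, predicted secreted portion)."""
        signal = aa_seq[self.first_aa_of_sig_pep - 1:self.last_aa_of_sig_pep]
        secreted = aa_seq[
            self.predicted_first_aa_of_secreted_portion - 1:self.last_aa_of_orf
        ]
        return signal, secreted


# Made-up 23-residue sequence and positions, for illustration only.
example_seq = "MKTLLLAVLLGSAQAQDPEKRST"
example_row = Table1Row(1, 15, 16, 23)
signal_peptide, secreted_portion = example_row.split(example_seq)
print(signal_peptide, secreted_portion)
```
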
SEQ ID NO:X and the translated SEQ ID NO:Y are sufficiently accurate and
otherwise suitable for a variety of uses well known in the art and described further
below. For instance, SEQ ID NO:X is useful for designing nucleic acid hybridization
probes that will detect nucleic acid sequences contained in SEQ ID NO:X or the
cDNA contained in the deposited clone. These probes will also hybridize to nucleic
acid molecules in biological samples, thereby enabling a variety of forensic and
diagnostic methods of the invention. Similarly, polypeptides identified from SEQ ID
NO:Y may be used to generate antibodies which bind specifically to the secreted
proteins encoded by the cDNA clones identified in Table 1.

Nevertheless, DNA sequences generated by sequencing reactions can contain
sequencing errors. The errors exist as misidentified nucleotides, or as insertions or
deletions of nucleotides in the generated DNA sequence. The erroneously inserted or
deleted nucleotides cause frame shifts in the reading frames of the predicted amino
acid sequence. In these cases, the predicted amino acid sequence diverges from the
actual amino acid sequence, even though the generated DNA sequence may be greater
than 99.9% identical to the actual DNA sequence (for example, one base insertion or
deletion in an open reading frame of over 1000 bases).
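
The effect described above can be illustrated with a short sketch. The Python
below is illustrative only: the toy open reading frame, the insertion position,
and the helper names translate and single_indel_identity are assumptions and are
not part of the disclosure. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, built with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate in frame 1; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def single_indel_identity(orf_length):
    """Percent of positions unaffected by one inserted or deleted base."""
    return 100.0 * (orf_length - 1) / orf_length


# Hypothetical toy ORF (not a sequence from Table 1).
orf = "ATGGCTGCTAAAGGTCTGTTC"
# One erroneous base inserted after the second codon shifts the frame.
shifted = orf[:6] + "A" + orf[6:]

print(translate(orf))               # reading frame as intended
print(translate(shifted))           # diverges downstream of the insertion
print(single_indel_identity(1000))  # nucleotide-level identity stays high
```

A single inserted base leaves the nucleotide sequence almost unchanged, yet every
codon downstream of the insertion is read in a different frame.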

Accordingly, for those applications requiring precision in the nucleotide
sequence or the amino acid sequence, the present invention provides not only the
generated nucleotide sequence identified as SEQ ID NO:X and the predicted
translated amino acid sequence identified as SEQ ID NO:Y, but also a sample of
plasmid DNA containing a human cDNA of the invention deposited with the ATCC,
as set forth in Table 1. The nucleotide sequence of each deposited clone can readily
be determined by sequencing the deposited clone in accordance with known methods.
The predicted amino acid sequence can then be verified from such deposits.
Moreover, the amino acid sequence of the protein encoded by a particular clone can
also be directly determined by peptide sequencing or by expressing the protein in a
suitable host cell containing the deposited human cDNA, collecting the protein, and
determining its sequence.

The present invention also relates to the genes corresponding to SEQ ID
NO:X, SEQ ID NO:Y, or the deposited clone. The corresponding gene can be
isolated in accordance with known methods using the sequence information disclosed
herein. Such methods include preparing probes or primers from the disclosed
sequence and identifying or amplifying the corresponding gene from appropriate
sources of genomic material.

Also provided in the present invention are species homologs. Species
homologs may be isolated and identified by making suitable probes or primers from
the sequences provided herein and screening a suitable nucleic acid source for the
desired homolog.

The polypeptides of the invention can be prepared in any suitable manner.
Such polypeptides include isolated naturally occurring polypeptides, recombinantly
produced polypeptides, synthetically produced polypeptides, or polypeptides
produced by a combination of these methods. Means for preparing such polypeptides
are well understood in the art.

The polypeptides may be in the form of the secreted protein, including the
mature form, or may be a part of a larger protein, such as a fusion protein (see below).
It is often advantageous to include an additional amino acid sequence which contains
secretory or leader sequences, pro-sequences, sequences which aid in purification,
such as multiple histidine residues, or an additional sequence for stability during
recombinant production.

The polypeptides of the present invention are preferably provided in an
isolated form, and preferably are substantially purified. A recombinantly produced
version of a polypeptide, including the secreted polypeptide, can be substantially
purified by the one-step method described in Smith and Johnson, Gene 67:31-40
(1988). Polypeptides of the invention also can be purified from natural or
recombinant sources using antibodies of the invention raised against the secreted
protein in methods which are well known in the art.

Signal Sequences 

Methods for predicting whether a protein has a signal sequence, as well as the
cleavage point for that sequence, are available. For instance, the method of
McGeoch, Virus Res. 3:271-286 (1985), uses the information from a short N-terminal
charged region and a subsequent uncharged region of the complete (uncleaved)
protein. The method of von Heijne, Nucleic Acids Res. 14:4683-4690 (1986) uses the
information from the residues surrounding the cleavage site, typically residues -13 to
+2, where +1 indicates the amino terminus of the secreted protein. The accuracy of
predicting the cleavage points of known mammalian secretory proteins for each of
these methods is in the range of 75-80%. (von Heijne, supra.) However, the two
methods do not always produce the same predicted cleavage point(s) for a given
protein.

In the present case, the deduced amino acid sequence of the secreted
polypeptide was analyzed by a computer program called SignalP (Henrik Nielsen et
al., Protein Engineering 10:1-6 (1997)), which predicts the cellular location of a
protein based on the amino acid sequence. As part of this computational prediction of
localization, the methods of McGeoch and von Heijne are incorporated. The analysis
of the amino acid sequences of the secreted proteins described herein by this program
provided the results shown in Table 1.

As one of ordinary skill would appreciate, however, cleavage sites sometimes
vary from organism to organism and cannot be predicted with absolute certainty.
Accordingly, the present invention provides secreted polypeptides having a sequence
shown in SEQ ID NO:Y which have an N-terminus beginning within 5 residues (i.e.,
+ or - 5 residues) of the predicted cleavage point. Similarly, it is also recognized that
in some cases, cleavage of the signal sequence from a secreted protein is not entirely
uniform, resulting in more than one secreted species. These polypeptides, and the
polynucleotides encoding such polypeptides, are contemplated by the present
invention.
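
The set of N-termini lying within 5 residues of a predicted cleavage point can be
enumerated with a short sketch. The Python below is illustrative only: the
function name n_terminal_variants, the example sequence, and the predicted
position are hypothetical. Positions are assumed to be 1-based, with the predicted
position being the first residue of the secreted portion.

```python
def n_terminal_variants(protein, predicted_first_aa, window=5):
    """Map each candidate start position to the polypeptide beginning there.

    Candidate starts lie within +/- `window` residues of the predicted first
    residue of the secreted portion and are clipped to the sequence bounds.
    """
    variants = {}
    first = predicted_first_aa - window
    last = predicted_first_aa + window
    for start in range(first, last + 1):
        if 1 <= start <= len(protein):
            variants[start] = protein[start - 1:]
    return variants


# Made-up 23-residue sequence with a hypothetical predicted start at residue 16.
example_protein = "MKTLLLAVLLGSAQAQDPEKRST"
variants = n_terminal_variants(example_protein, 16)
for start, sequence in variants.items():
    print(start, sequence)
```
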

Moreover, the signal sequence identified by the above analysis may not
necessarily predict the naturally occurring signal sequence. For example, the 
naturally occurring signal sequence may be further upstream from the predicted signal 
sequence. However, it is likely that the predicted signal sequence will be capable of 
directing the secreted protein to the ER. These polypeptides, and the polynucleotides 
encoding such polypeptides, are contemplated by the present invention. 

Polynucleotide and Polypeptide Variants

"Variant" refers to a polynucleotide or polypeptide differing from the 
polynucleotide or polypeptide of the present invention, but retaining essential 
properties thereof. Generally, variants are overall closely similar, and, in many
regions, identical to the polynucleotide or polypeptide of the present invention.

By a polynucleotide having a nucleotide sequence at least, for example, 95%
"identical" to a reference nucleotide sequence of the present invention, it is intended
that the nucleotide sequence of the polynucleotide is identical to the reference
sequence except that the polynucleotide sequence may include up to five point
mutations per each 100 nucleotides of the reference nucleotide sequence encoding the
polypeptide. In other words, to obtain a polynucleotide having a nucleotide sequence
at least 95% identical to a reference nucleotide sequence, up to 5% of the nucleotides
in the reference sequence may be deleted or substituted with another nucleotide, or a
number of nucleotides up to 5% of the total nucleotides in the reference sequence may
be inserted into the reference sequence. The query sequence may be an entire
sequence shown in Table 1, the ORF (open reading frame), or any fragment specified
as described herein.

As a practical matter, whether any particular nucleic acid molecule or
polypeptide is at least 90%, 95%, 96%, 97%, 98% or 99% identical to a nucleotide
sequence of the present invention can be determined conventionally using known
computer programs. A preferred method for determining the best overall match
between a query sequence (a sequence of the present invention) and a subject
sequence, also referred to as a global sequence alignment, can be determined using
the FASTDB computer program based on the algorithm of Brutlag et al. (Comp. App.
Biosci. (1990) 6:237-245). In a sequence alignment the query and subject sequences
are both DNA sequences. An RNA sequence can be compared by converting U's to
T's. The result of said global sequence alignment is in percent identity. Preferred
parameters used in a FASTDB alignment of DNA sequences to calculate percent
identity are: Matrix=Unitary, k-tuple=4, Mismatch Penalty=1, Joining Penalty=30,
Randomization Group Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size
Penalty=0.05, Window Size=500 or the length of the subject nucleotide sequence,
whichever is shorter.

If the subject sequence is shorter than the query sequence because of 5' or 3'
deletions, not because of internal deletions, a manual correction must be made to the
results. This is because the FASTDB program does not account for 5' and 3'
truncations of the subject sequence when calculating percent identity. For subject
sequences truncated at the 5' or 3' ends, relative to the query sequence, the
percent identity is corrected by calculating the number of bases of the query sequence
that are 5' and 3' of the subject sequence, which are not matched/aligned, as a percent
of the total bases of the query sequence. Whether a nucleotide is matched/aligned is
determined by results of the FASTDB sequence alignment. This percentage is then
subtracted from the percent identity, calculated by the above FASTDB program using
the specified parameters, to arrive at a final percent identity score. This corrected
score is what is used for the purposes of the present invention. Only bases outside the
5' and 3' bases of the subject sequence, as displayed by the FASTDB alignment,
which are not matched/aligned with the query sequence, are calculated for the
purposes of manually adjusting the percent identity score.

For example, a 90 base subject sequence is aligned to a 100 base query
sequence to determine percent identity. The deletions occur at the 5' end of the
subject sequence and therefore, the FASTDB alignment does not show a
match/alignment of the first 10 bases at the 5' end. The 10 unpaired bases represent
10% of the sequence (number of bases at the 5' and 3' ends not matched/total number
of bases in the query sequence) so 10% is subtracted from the percent identity score
calculated by the FASTDB program. If the remaining 90 bases were perfectly
matched the final percent identity would be 90%. In another example, a 90 base
subject sequence is compared with a 100 base query sequence. This time the
deletions are internal deletions so that there are no bases on the 5' or 3' of the subject
sequence which are not matched/aligned with the query. In this case the percent
identity calculated by FASTDB is not manually corrected. Once again, only bases 5'
and 3' of the subject sequence which are not matched/aligned with the query sequence
are manually corrected for. No other manual corrections are to be made for the purposes
of the present invention.
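
The manual correction described above reduces to a single subtraction, sketched
below. The Python is illustrative only: the function name
corrected_percent_identity and its parameter names are hypothetical, and the
percent identity passed in is assumed to have been reported by the alignment
program; the sketch does not run or emulate FASTDB.

```python
def corrected_percent_identity(reported_percent, query_length,
                               unmatched_5prime, unmatched_3prime):
    """Apply the end-truncation correction to a reported percent identity.

    `unmatched_5prime` and `unmatched_3prime` count query positions lying
    outside the ends of the subject sequence that are not matched/aligned.
    Internal gaps are not counted here.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    overhang = unmatched_5prime + unmatched_3prime
    penalty = 100.0 * overhang / query_length
    return reported_percent - penalty


# First example: 10 unmatched query bases at the 5' end, remainder perfect.
print(corrected_percent_identity(100.0, 100, 10, 0))

# Second example: deletions are internal, so nothing is subtracted.
# The reported value of 90.0 is an assumed figure for illustration.
print(corrected_percent_identity(90.0, 100, 0, 0))
```

The function is indifferent to the unit counted, so the same arithmetic applies
whether the positions are bases or amino acid residues.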

By a polypeptide having an amino acid sequence at least, for example, 95%
"identical" to a query amino acid sequence of the present invention, it is intended that
the amino acid sequence of the subject polypeptide is identical to the query sequence
except that the subject polypeptide sequence may include up to five amino acid
alterations per each 100 amino acids of the query amino acid sequence. In other
words, to obtain a polypeptide having an amino acid sequence at least 95% identical
to a query amino acid sequence, up to 5% of the amino acid residues in the subject
sequence may be inserted, deleted, (indels) or substituted with another amino acid.
These alterations of the reference sequence may occur at the amino or carboxy
terminal positions of the reference amino acid sequence or anywhere between those
terminal positions, interspersed either individually among residues in the reference
sequence or in one or more contiguous groups within the reference sequence.

As a practical matter, whether any particular polypeptide is at least 90%, 95%,
96%, 97%, 98% or 99% identical to, for instance, the amino acid sequences shown in
Table 1 or to the amino acid sequence encoded by deposited DNA clone can be
determined conventionally using known computer programs. A preferred method for
determining the best overall match between a query sequence (a sequence of the present
invention) and a subject sequence, also referred to as a global sequence alignment,
can be determined using the FASTDB computer program based on the algorithm of
Brutlag et al. (Comp. App. Biosci. (1990) 6:237-245). In a sequence alignment the
query and subject sequences are either both nucleotide sequences or both amino acid
sequences. The result of said global sequence alignment is in percent identity.
Preferred parameters used in a FASTDB amino acid alignment are: Matrix=PAM 0,
k-tuple=2, Mismatch Penalty=1, Joining Penalty=20, Randomization Group
Length=0, Cutoff Score=1, Window Size=sequence length, Gap Penalty=5, Gap Size
Penalty=0.05, Window Size=500 or the length of the subject amino acid sequence,
whichever is shorter.

If the subject sequence is shorter than the query sequence due to N- or C-
terminal deletions, not because of internal deletions, a manual correction must be
made to the results. This is because the FASTDB program does not account for N-
and C-terminal truncations of the subject sequence when calculating global percent
identity. For subject sequences truncated at the N- and C-termini, relative to the
query sequence, the percent identity is corrected by calculating the number of residues
of the query sequence that are N- and C-terminal of the subject sequence, which are
not matched/aligned with a corresponding subject residue, as a percent of the total
residues of the query sequence. Whether a residue is matched/aligned is determined by
results of the FASTDB sequence alignment. This percentage is then subtracted from
the percent identity, calculated by the above FASTDB program using the specified
parameters, to arrive at a final percent identity score. This final percent identity score
is what is used for the purposes of the present invention. Only residues to the N- and
C-termini of the subject sequence, which are not matched/aligned with the query
sequence, are considered for the purposes of manually adjusting the percent identity
score. That is, only query residue positions outside the farthest N- and C-terminal
residues of the subject sequence are considered.

For example, a 90 amino acid residue subject sequence is aligned with a 100 
residue query sequence to determine percent identity. The deletion occurs at the N- 
terminus of the subject sequence and therefore, the FASTDB alignment does not 
show a matching/alignment of the first 10 residues at the N-terminus. The 10 
unpaired residues represent 10% of the sequence (number of residues at the N- and C- 
termini not matched/total number of residues in the query sequence) so 10% is 
subtracted from the percent identity score calculated by the FASTDB program. If the 
remaining 90 residues were perfectly matched the final percent identity would be 
90%. In another example, a 90 residue subject sequence is compared with a 100 
residue query sequence. This time the deletions are internal deletions so there are no 
residues at the N- or C-termini of the subject sequence which are not matched/aligned 
with the query. In this case the percent identity calculated by FASTDB is not 
manually corrected. Once again, only residue positions outside the N- and C-terminal 
ends of the subject sequence, as displayed in the FASTDB alignment, which are not 
matched/aligned with the query sequence are manually corrected for. No other manual
corrections are to be made for the purposes of the present invention.

The variants may contain alterations in the coding regions, non-coding 
regions, or both. Especially preferred are polynucleotide variants containing 
alterations which produce silent substitutions, additions, or deletions, but do not alter 
the properties or activities of the encoded polypeptide. Nucleotide variants produced 
by silent substitutions due to the degeneracy of the genetic code are preferred. 
Moreover, variants in which 5-10, 1-5, or 1-2 amino acids are substituted, deleted, or 
added in any combination are also preferred. Polynucleotide variants can be produced 
for a variety of reasons, e.g., to optimize codon expression for a particular host
(change codons in the human mRNA to those preferred by a bacterial host such as E.
coli).
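
Silent recoding of this kind can be sketched as follows. The Python is
illustrative only: the function names, the toy open reading frame, and in
particular the PREFERRED mapping are hypothetical; the mapping is a made-up
stand-in for host codon preferences, not measured codon usage data. The codon
table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, built with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical host preferences (illustrative values, not usage statistics).
PREFERRED = {"L": "CTG", "R": "CGT", "K": "AAA"}


def translate(orf):
    """Translate in frame 1; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf) - 2, 3))


def recode(orf, preferred):
    """Swap each codon for the listed synonymous codon, if one is given."""
    codons = []
    for i in range(0, len(orf) - 2, 3):
        codon = orf[i:i + 3]
        choice = preferred.get(CODON_TABLE[codon], codon)
        # Guard: the replacement must encode the same amino acid.
        assert CODON_TABLE[choice] == CODON_TABLE[codon]
        codons.append(choice)
    return "".join(codons)


human_orf = "ATGTTAAGAAAGTAA"  # made-up ORF encoding M, L, R, K, stop
host_orf = recode(human_orf, PREFERRED)
print(host_orf)
print(translate(human_orf) == translate(host_orf))
```

Because every replacement is checked against the codon table, the nucleotide
sequence changes while the encoded polypeptide does not.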

Naturally occurring variants are called "allelic variants," and refer to one of
several alternate forms of a gene occupying a given locus on a chromosome of an
organism. (Genes II, Lewin, B., ed., John Wiley & Sons, New York (1985).) These
allelic variants can vary at the polynucleotide and/or polypeptide level.
Alternatively, non-naturally occurring variants may be produced by mutagenesis
techniques or by direct synthesis.

Using known methods of protein engineering and recombinant DNA
technology, variants may be generated to improve or alter the characteristics of the
polypeptides of the present invention. For instance, one or more amino acids can be
deleted from the N-terminus or C-terminus of the secreted protein without substantial
loss of biological function. The authors of Ron et al., J. Biol. Chem. 268:2984-2988
(1993), reported variant KGF proteins having heparin binding activity even after
deleting 3, 8, or 27 amino-terminal amino acid residues. Similarly, Interferon gamma
exhibited up to ten times higher activity after deleting 8-10 amino acid residues from
the carboxy terminus of this protein. (Dobeli et al., J. Biotechnology 7:199-216
(1988).)

Moreover, ample evidence demonstrates that variants often retain a biological
activity similar to that of the naturally occurring protein. For example, Gayle and
coworkers (J. Biol. Chem. 268:22105-22111 (1993)) conducted extensive mutational
analysis of human cytokine IL-1a. They used random mutagenesis to generate over
3,500 individual IL-1a mutants that averaged 2.5 amino acid changes per variant over
the entire length of the molecule. Multiple mutations were examined at every
possible amino acid position. The investigators found that "[m]ost of the molecule
could be altered with little effect on either [binding or biological activity]." (See,
Abstract.) In fact, only 23 unique amino acid sequences, out of more than 3,500
nucleotide sequences examined, produced a protein that significantly differed in
activity from wild-type.

Furthermore, even if deleting one or more amino acids from the N-terminus or
C-terminus of a polypeptide results in modification or loss of one or more biological
functions, other biological activities may still be retained. For example, the ability of
a deletion variant to induce and/or to bind antibodies which recognize the secreted
form will likely be retained when less than the majority of the residues of the secreted
form are removed from the N-terminus or C-terminus. Whether a particular
polypeptide lacking N- or C-terminal residues of a protein retains such immunogenic
activities can readily be determined by routine methods described herein and
otherwise known in the art.

Thus, the invention further includes polypeptide variants which show
substantial biological activity. Such variants include deletions, insertions,
inversions, repeats, and substitutions selected according to general rules known in the
art so as to have little effect on activity. For example, guidance concerning how to make
phenotypically silent amino acid substitutions is provided in Bowie, J. U. et al.,
Science 247:1306-1310 (1990), wherein the authors indicate that there are two main
strategies for studying the tolerance of an amino acid sequence to change.

The first strategy exploits the tolerance of amino acid substitutions by natural 
selection during the process of evolution. By comparing amino acid sequences in 
different species, conserved amino acids can be identified. These conserved amino 
acids are likely important for protein function. In contrast, the amino acid positions
where substitutions have been tolerated by natural selection indicate that these
positions are not critical for protein function. Thus, positions tolerating amino acid
substitution could be modified while still maintaining biological activity of the 
protein. 

The second strategy uses genetic engineering to introduce amino acid changes 
at specific positions of a cloned gene to identify regions critical for protein function. 
For example, site directed mutagenesis or alanine-scanning mutagenesis (introduction
of single alanine mutations at every residue in the molecule) can be used.
(Cunningham and Wells, Science 244:1081-1085 (1989).) The resulting mutant 
molecules can then be tested for biological activity. 
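
The set of single-alanine mutants implied by alanine-scanning mutagenesis can be
enumerated in silico with a short sketch. The Python below is illustrative only:
the function name alanine_scan, the mutation label format, and the four-residue
example are hypothetical. Positions that are already alanine are skipped, which is
a choice made for this sketch.

```python
def alanine_scan(protein):
    """Yield (label, mutant) pairs, one single-alanine substitution each.

    Labels use the form <original residue><1-based position>A, e.g. "K2A".
    """
    for index, residue in enumerate(protein):
        if residue == "A":
            continue  # substituting alanine for alanine changes nothing
        mutant = protein[:index] + "A" + protein[index + 1:]
        yield f"{residue}{index + 1}A", mutant


# Made-up four-residue peptide, for illustration only.
scan = list(alanine_scan("MKAL"))
for label, mutant in scan:
    print(label, mutant)
```
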

As the authors state, these two strategies have revealed that proteins are
surprisingly tolerant of amino acid substitutions. The authors further indicate which
amino acid changes are likely to be permissive at certain amino acid positions in the
protein. For example, most buried (within the tertiary structure of the protein) amino
acid residues require nonpolar side chains, whereas few features of surface side chains
are generally conserved. Moreover, tolerated conservative amino acid substitutions
involve replacement of the aliphatic or hydrophobic amino acids Ala, Val, Leu and
Ile; replacement of the hydroxyl residues Ser and Thr; replacement of the acidic
residues Asp and Glu; replacement of the amide residues Asn and Gln; replacement of
the basic residues Lys, Arg, and His; replacement of the aromatic residues Phe, Tyr,
and Trp; and replacement of the small-sized amino acids Ala, Ser, Thr, Met, and Gly.
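
The residue groupings listed above lend themselves to a simple lookup, sketched
below. The Python is illustrative only: the dictionary name CONSERVATIVE_GROUPS,
the group labels, and the function name is_conservative are hypothetical. The
group memberships are taken from the preceding sentence, written in one-letter
code.

```python
# Groups of residues within which replacement is treated as conservative.
CONSERVATIVE_GROUPS = {
    "aliphatic or hydrophobic": set("AVLI"),  # Ala, Val, Leu, Ile
    "hydroxyl": set("ST"),                    # Ser, Thr
    "acidic": set("DE"),                      # Asp, Glu
    "amide": set("NQ"),                       # Asn, Gln
    "basic": set("KRH"),                      # Lys, Arg, His
    "aromatic": set("FYW"),                   # Phe, Tyr, Trp
    "small": set("ASTMG"),                    # Ala, Ser, Thr, Met, Gly
}


def is_conservative(original, replacement):
    """True if both residues differ and share at least one listed group."""
    if original == replacement:
        return False  # not a substitution at all
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS.values()
    )


print(is_conservative("D", "E"))  # both acidic
print(is_conservative("A", "G"))  # both small
print(is_conservative("L", "K"))  # no shared group
```
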

Besides conservative amino acid substitution, variants of the present invention 
include (i) substitutions with one or more of the non-conserved amino acid residues, 
where the substituted amino acid residues may or may not be one encoded by the 
genetic code, or (ii) substitution with one or more of amino acid residues having a 
substituent group, or (iii) fusion of the mature polypeptide with another compound, 
such as a compound to increase the stability and/or solubility of the polypeptide (for 
example, polyethylene glycol), or (iv) fusion of the polypeptide with additional amino 
acids, such as an IgG Fc fusion region peptide, or leader or secretory sequence, or a 
sequence facilitating purification. Such variant polypeptides are deemed to be within 
the scope of those skilled in the art from the teachings herein. 

For example, polypeptide variants containing amino acid substitutions of 
charged amino acids with other charged or neutral amino acids may produce proteins 
with improved characteristics, such as less aggregation. Aggregation of 
pharmaceutical formulations both reduces activity and increases clearance due to the 
aggregate's immunogenic activity. (Pinckard et al., Clin. Exp. Immunol. 2:331-340 
(1967); Robbins et al., Diabetes 36:838-845 (1987); Cleland et al., Crit. Rev.
Therapeutic Drug Carrier Systems 10:307-377 (1993).)

A further embodiment of the invention relates to a polypeptide which 
comprises the amino acid sequence of the present invention having an amino acid 
sequence which contains at least one amino acid substitution, but not more than 50 
amino acid substitutions, even more preferably, not more than 40 amino acid 
substitutions, still more preferably, not more than 30 amino acid substitutions, and 
still even more preferably, not more than 20 amino acid substitutions. Of course, in 
order of ever-increasing preference, it is highly preferable for a polypeptide to have 
an amino acid sequence which comprises the amino acid sequence of the present 
invention, which contains at least one, but not more than 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1
amino acid substitutions. In specific embodiments, the number of additions,
substitutions, and/or deletions in the amino acid sequence of the present invention or
fragments thereof (e.g., the mature form and/or other fragments described herein), is
1-5, 5-10, 5-25, 5-50, 10-50 or 50-150; conservative amino acid substitutions are
preferable.

Polynucleotide and Polypeptide Fragments

In the present invention, a "polynucleotide fragment" refers to a short 
polynucleotide having a nucleic acid sequence contained in the deposited clone or 
shown in SEQ ID NO:X. The short nucleotide fragments are preferably at least about 
15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 
nt, and even more preferably, at least about 40 nt in length. A fragment "at least 20 nt
in length," for example, is intended to include 20 or more contiguous bases from the
cDNA sequence contained in the deposited clone or the nucleotide sequence shown in 
SEQ ID NO:X. These nucleotide fragments are useful as diagnostic probes and 
primers as discussed herein. Of course, larger fragments (e.g., 50, 150, 500, 600, 
2000 nucleotides) are preferred. 

Moreover, representative examples of polynucleotide fragments of the
invention, include, for example, fragments having a sequence from about nucleotide
number 1-50, 51-100, 101-150, 151-200, 201-250, 251-300, 301-350, 351-400,
401-450, 451-500, 501-550, 551-600, 651-700, 701-750, 751-800, 800-850, 851-900,
901-950, 951-1000, 1001-1050, 1051-1100, 1101-1150, 1151-1200, 1201-1250,
1251-1300, 1301-1350, 1351-1400, 1401-1450, 1451-1500, 1501-1550, 1551-1600,
1601-1650, 1651-1700, 1701-1750, 1751-1800, 1801-1850, 1851-1900, 1901-1950,
1951-2000, or 2001 to the end of SEQ ID NO:X or the cDNA contained in the deposited
clone. In this context "about" includes the particularly recited ranges, larger or
smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini.
Preferably, these fragments encode a polypeptide which has biological activity. More
preferably, these polynucleotides can be used as probes or primers as discussed
herein.

In the present invention, a "polypeptide fragment" refers to a short amino acid
sequence contained in SEQ ID NO:Y or encoded by the cDNA contained in the
deposited clone. Protein fragments may be "free-standing," or comprised within a
larger polypeptide of which the fragment forms a part or region, most preferably as a
single continuous region. Representative examples of polypeptide fragments of the
invention, include, for example, fragments from about amino acid number 1-20,
21-40, 41-60, 61-80, 81-100, 102-120, 121-140, 141-160, or 161 to the end of the coding
region. Moreover, polypeptide fragments can be about 20, 30, 40, 50, 60, 70, 80, 90,
100, 110, 120, 130, 140, or 150 amino acids in length. In this context "about"
includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1)
amino acids, at either extreme or at both extremes.
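
The windowing used for representative fragments, together with the meaning given
to "about," can be sketched as follows. The Python is illustrative only: the
function names tile_ranges and about are hypothetical, and the sketch tiles a
sequence uniformly, so it approximates the recited ranges rather than reproducing
them item for item. Positions are 1-based and inclusive.

```python
def tile_ranges(length, step):
    """Consecutive windows of `step` positions; the last runs to the end."""
    return [
        (start, min(start + step - 1, length))
        for start in range(1, length + 1, step)
    ]


def about(start, end, length, slack=5):
    """Ranges whose termini each lie within `slack` positions of the given
    range, at either terminus or both, clipped to the sequence bounds."""
    return [
        (s, e)
        for s in range(start - slack, start + slack + 1)
        for e in range(end - slack, end + slack + 1)
        if 1 <= s <= e <= length
    ]


# Hypothetical lengths: a 170-residue coding region and a 2000-nt sequence.
print(tile_ranges(170, 20))          # 20-residue polypeptide windows
print(len(about(51, 100, 2000)))     # variants of an interior 50-nt window
print(len(about(1, 50, 2000)))       # fewer variants at the sequence start
```
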

Preferred polypeptide fragments include the secreted protein as well as the
mature form. Further preferred polypeptide fragments include the secreted protein or
the mature form having a continuous series of deleted residues from the amino or the
carboxy terminus, or both. For example, any number of amino acids, ranging from
1-60, can be deleted from the amino terminus of either the secreted polypeptide or the
mature form. Similarly, any number of amino acids, ranging from 1-30, can be
deleted from the carboxy terminus of the secreted protein or mature form.
Furthermore, any combination of the above amino and carboxy terminus deletions is
preferred. Similarly, polynucleotide fragments encoding these polypeptide fragments
are also preferred.
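
The combinations of amino and carboxy terminal deletions described above can be
enumerated with a short sketch. The Python is illustrative only: the function name
terminal_deletions, the min_length guard, and the ten-residue example are
hypothetical. The defaults of 60 and 30 follow the ranges given in the text.

```python
def terminal_deletions(protein, max_n=60, max_c=30, min_length=1):
    """Yield (n_deleted, c_deleted, fragment) for every combination of
    N-terminal and C-terminal deletions that removes at least one residue
    and leaves at least `min_length` residues."""
    for n_deleted in range(0, max_n + 1):
        for c_deleted in range(0, max_c + 1):
            if n_deleted == 0 and c_deleted == 0:
                continue  # the undeleted form is not a deletion variant
            remaining = len(protein) - n_deleted - c_deleted
            if remaining < min_length:
                continue
            yield n_deleted, c_deleted, protein[n_deleted:len(protein) - c_deleted]


# Made-up ten-residue sequence with small limits to keep the output short.
fragments = list(terminal_deletions("MKTAYIAKQR", max_n=2, max_c=1))
for n_deleted, c_deleted, fragment in fragments:
    print(n_deleted, c_deleted, fragment)
```
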

Also preferred are polypeptide and polynucleotide fragments characterized by
structural or functional domains, such as fragments that comprise alpha-helix and
alpha-helix forming regions, beta-sheet and beta-sheet-forming regions, turn and
turn-forming regions, coil and coil-forming regions, hydrophilic regions, hydrophobic
regions, alpha amphipathic regions, beta amphipathic regions, flexible regions,
surface-forming regions, substrate binding region, and high antigenic index regions.
Polypeptide fragments of SEQ ID NO:Y falling within conserved domains are
specifically contemplated by the present invention. Moreover, polynucleotide
fragments encoding these domains are also contemplated.

Other preferred fragments are biologically active fragments. Biologically
active fragments are those exhibiting activity similar, but not necessarily identical, to
an activity of the polypeptide of the present invention. The biological activity of the
fragments may include an improved desired activity, or a decreased undesirable
activity.

Epitopes & Antibodies

In the present invention, "epitopes" refer to polypeptide fragments having
antigenic or immunogenic activity in an animal, especially in a human. A preferred
embodiment of the present invention relates to a polypeptide fragment comprising an
epitope, as well as the polynucleotide encoding this fragment. A region of a protein
molecule to which an antibody can bind is defined as an "antigenic epitope." In
contrast, an "immunogenic epitope" is defined as a part of a protein that elicits an
antibody response. (See, for instance, Geysen et al., Proc. Natl. Acad. Sci. USA
81:3998-4002 (1983).)

Fragments which function as epitopes may be produced by any conventional
means. (See, e.g., Houghten, R. A., Proc. Natl. Acad. Sci. USA 82:5131-5135 (1985)
further described in U.S. Patent No. 4,631,211.)

In the present invention, antigenic epitopes preferably contain a sequence of at 
least seven, more preferably at least nine, and most preferably between about 15 to 
about 30 amino acids. Antigenic epitopes are useful to raise antibodies, including 
monoclonal antibodies, that specifically bind the epitope. (See, for instance, Wilson
et al., Cell 37:767-778 (1984); Sutcliffe, J. G. et al., Science 219:660-666 (1983).)

Similarly, immunogenic epitopes can be used to induce antibodies according 
to methods well known in the art. (See, for instance, Sutcliffe et al., supra; Wilson et
al., supra; Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et
al., J. Gen. Virol. 66:2347-2354 (1985).) A preferred immunogenic epitope includes 
the secreted protein. The immunogenic epitopes may be presented together with a 
carrier protein, such as an albumin, to an animal system (such as rabbit or mouse) or, 
if it is long enough (at least about 25 amino acids), without a carrier. However, 
immunogenic epitopes comprising as few as 8 to 10 amino acids have been shown to 
be sufficient to raise antibodies capable of binding to, at the very least, linear epitopes
in a denatured polypeptide (e.g., in Western blotting).

As used herein, the term "antibody" (Ab) or "monoclonal antibody" (Mab) is 
meant to include intact molecules as well as antibody fragments (such as, for 
example, Fab and F(ab')2 fragments) which are capable of specifically binding to
protein. Fab and F(ab')2 fragments lack the Fc fragment of intact antibody, clear 
more rapidly from the circulation, and may have less non-specific tissue binding than 
an intact antibody. (Wahl et al., J. Nucl. Med. 24:316-325 (1983).) Thus, these
fragments are preferred, as well as the products of a Fab or other immunoglobulin
expression library. Moreover, antibodies of the present invention include chimeric, 
single chain, and humanized antibodies. 

Fusion Proteins

Any polypeptide of the present invention can be used to generate fusion 
proteins. For example, the polypeptide of the present invention, when fused to a 
second protein, can be used as an antigenic tag. Antibodies raised against the 
polypeptide of the present invention can be used to indirectly detect the second
protein by binding to the polypeptide. Moreover, because secreted proteins target
cellular locations based on trafficking signals, the polypeptides of the present 
invention can be used as targeting molecules once fused to other proteins. 

Examples of domains that can be fused to polypeptides of the present 
invention include not only heterologous signal sequences, but also other heterologous
functional regions. The fusion does not necessarily need to be direct, but may occur
through linker sequences. 

Moreover, fusion proteins may also be engineered to improve characteristics 
of the polypeptide of the present invention. For instance, a region of additional amino
acids, particularly charged amino acids, may be added to the N-terminus of the
polypeptide to improve stability and persistence during purification from the host cell
or subsequent handling and storage. Also, peptide moieties may be added to the 
polypeptide to facilitate purification. Such regions may be removed prior to final 
preparation of the polypeptide. The addition of peptide moieties to facilitate handling
of polypeptides is a familiar and routine technique in the art.

Moreover, polypeptides of the present invention, including fragments, and
specifically epitopes, can be combined with parts of the constant domain of
immunoglobulins (IgG), resulting in chimeric polypeptides. These fusion proteins 
facilitate purification and show an increased half-life in vivo. One reported example 
describes chimeric proteins consisting of the first two domains of the human
CD4-polypeptide and various domains of the constant regions of the heavy or light chains
of mammalian immunoglobulins. (EP A 394,827; Traunecker et al., Nature 331:84-86
(1988).) Fusion proteins having disulfide-linked dimeric structures (due to the
IgG) can also be more efficient in binding and neutralizing other molecules than the
monomeric secreted protein or protein fragment alone. (Fountoulakis et al., J.
Biochem. 270:3958-3964 (1995).)

Similarly, EP-A-0 464 533 (Canadian counterpart 2045869) discloses fusion 
proteins comprising various portions of constant region of immunoglobulin molecules
together with another human protein or part thereof. In many cases, the Fc part in a 
fusion protein is beneficial in therapy and diagnosis, and thus can result in, for 
example, improved pharmacokinetic properties. (EP-A 0232 262.) Alternatively, 
deleting the Fc part after the fusion protein has been expressed, detected, and purified, 
would be desired. For example, the Fc portion may hinder therapy and diagnosis if 
the fusion protein is used as an antigen for immunizations. In drug discovery, for 
example, human proteins, such as hIL-5, have been fused with Fc portions for the
purpose of high-throughput screening assays to identify antagonists of hIL-5. (See,
D. Bennett et al., J. Molecular Recognition 8:52-58 (1995); K. Johanson et al., J. Biol.
Chem. 270:9459-9471 (1995).)

Moreover, the polypeptides of the present invention can be fused to marker 
sequences, such as a peptide which facilitates purification of the fused polypeptide. 
In preferred embodiments, the marker amino acid sequence is a hexa-histidine 
peptide, such as the tag provided in a pQE vector (QIAGEN, Inc., 9259 Eton Avenue,
Chatsworth, CA, 91311), among others, many of which are commercially available.
As described in Gentz et al., Proc. Natl. Acad. Sci. USA 86:821-824 (1989), for
instance, hexa-histidine provides for convenient purification of the fusion protein.
Another peptide tag useful for purification, the "HA" tag, corresponds to an epitope
derived from the influenza hemagglutinin protein. (Wilson et al., Cell 37:767
(1984).) 

Thus, any of these above fusions can be engineered using the polynucleotides 
or the polypeptides of the present invention. 

Vectors, Host Cells, and Protein Production

The present invention also relates to vectors containing the polynucleotide of 
the present invention, host cells, and the production of polypeptides by recombinant 
techniques. The vector may be, for example, a phage, plasmid, viral, or retroviral
vector. Retroviral vectors may be replication competent or replication defective. In
the latter case, viral propagation generally will occur only in complementing host 
cells. 

The polynucleotides may be joined to a vector containing a selectable marker 
for propagation in a host. Generally, a plasmid vector is introduced in a precipitate,
such as a calcium phosphate precipitate, or in a complex with a charged lipid. If the
vector is a virus, it may be packaged in vitro using an appropriate packaging cell line
and then transduced into host cells. 

The polynucleotide insert should be operatively linked to an appropriate
promoter, such as the phage lambda PL promoter, the E. coli lac, trp, phoA and tac
promoters, the SV40 early and late promoters and promoters of retroviral LTRs, to
name a few. Other suitable promoters will be known to the skilled artisan. The
expression constructs will further contain sites for transcription initiation, termination,
and, in the transcribed region, a ribosome binding site for translation. The coding
portion of the transcripts expressed by the constructs will preferably include a
translation initiating codon at the beginning and a termination codon (UAA, UGA or
UAG) appropriately positioned at the end of the polypeptide to be translated.
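The requirement that the coding portion begin with an initiating codon and end with one of the three termination codons can be checked mechanically. The sketch below is illustrative only: the function name is hypothetical, and it assumes the initiating codon is AUG, which the text does not state explicitly.

```python
STOP_CODONS = {"UAA", "UGA", "UAG"}  # termination codons named in the text
START_CODON = "AUG"                  # assumption: the usual initiating codon


def has_start_and_stop(coding_seq):
    """True if the sequence is a whole number of codons, starts with the
    initiating codon, ends with a termination codon, and has no
    termination codon before the end."""
    rna = coding_seq.upper().replace("T", "U")  # accept DNA or RNA letters
    if len(rna) < 6 or len(rna) % 3 != 0:
        return False
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    internal_stop = any(codon in STOP_CODONS for codon in codons[:-1])
    return (codons[0] == START_CODON
            and codons[-1] in STOP_CODONS
            and not internal_stop)


print(has_start_and_stop("AUGGCCUAA"))
print(has_start_and_stop("AUGUAAGCC"))
```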

As indicated, the expression vectors will preferably include at least one 
selectable marker. Such markers include dihydrofolate reductase, G418 or neomycin
resistance for eukaryotic cell culture and tetracycline, kanamycin or ampicillin
resistance genes for culturing in E. coli and other bacteria. Representative examples
of appropriate hosts include, but are not limited to, bacterial cells, such as E. coli,
Streptomyces and Salmonella typhimurium cells; fungal cells, such as yeast cells;
insect cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells such as
CHO, COS, 293, and Bowes melanoma cells; and plant cells. Appropriate culture
mediums and conditions for the above-described host cells are known in the art. 

Vectors preferred for use in bacteria include pQE70, pQE60 and pQE-9,
available from QIAGEN, Inc.; pBluescript vectors, Phagescript vectors, pNH8A,
pNH16a, pNH18A, pNH46A, available from Stratagene Cloning Systems, Inc.; and
ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 available from Pharmacia Biotech,
Inc. Among preferred eukaryotic vectors are pWLNEO, pSV2CAT, pOG44, pXT1
and pSG available from Stratagene; and pSVK3, pBPV, pMSG and pSVL available
from Pharmacia. Other suitable vectors will be readily apparent to the skilled artisan.

Introduction of the construct into the host cell can be effected by calcium 
phosphate transfection, DEAE-dextran mediated transfection, cationic lipid-mediated
transfection, electroporation, transduction, infection, or other methods. Such methods
are described in many standard laboratory manuals, such as Davis et al., Basic
Methods In Molecular Biology (1986). It is specifically contemplated that the 
polypeptides of the present invention may in fact be expressed by a host cell lacking a 
recombinant vector. 

A polypeptide of this invention can be recovered and purified from 
recombinant cell cultures by well-known methods including ammonium sulfate or 
ethanol precipitation, acid extraction, anion or cation exchange chromatography, 
phosphocellulose chromatography, hydrophobic interaction chromatography, affinity 
chromatography, hydroxylapatite chromatography and lectin chromatography. Most
preferably, high performance liquid chromatography ("HPLC") is employed for 
purification. 

Polypeptides of the present invention, and preferably the secreted form, can 
also be recovered from: products purified from natural sources, including bodily 
fluids, tissues and cells, whether directly isolated or cultured; products of chemical 
synthetic procedures; and products produced by recombinant techniques from a
prokaryotic or eukaryotic host, including, for example, bacterial, yeast, higher plant,
insect, and mammalian cells. Depending upon the host employed in a recombinant
production procedure, the polypeptides of the present invention may be glycosylated
or may be non-glycosylated. In addition, polypeptides of the invention may also
include an initial modified methionine residue, in some cases as a result of
host-mediated processes. Thus, it is well known in the art that the N-terminal methionine
encoded by the translation initiation codon generally is removed with high efficiency 
from any protein after translation in all eukaryotic cells. While the N-terminal 
methionine on most proteins also is efficiently removed in most prokaryotes, for some 
proteins, this prokaryotic removal process is inefficient, depending on the nature of 
the amino acid to which the N-terminal methionine is covalently linked.

In addition to encompassing host cells containing the vector constructs
discussed herein, the invention also encompasses primary, secondary, and
immortalized host cells of vertebrate origin, particularly mammalian origin, that have
been engineered to delete or replace endogenous genetic material (e.g., coding
sequence), and/or to include genetic material (e.g., heterologous polynucleotide
sequences) that is operably associated with the polynucleotides of the invention, and
which activates, alters, and/or amplifies endogenous polynucleotides. For example,
techniques known in the art may be used to operably associate heterologous control
regions (e.g., promoter and/or enhancer) and endogenous polynucleotide sequences
via homologous recombination (see, e.g., U.S. Patent No. 5,641,670, issued June 24,
1997; International Publication No. WO 96/29411, published September 26, 1996;
International Publication No. WO 94/12650, published August 4, 1994; Koller et al.,
Proc. Natl. Acad. Sci. USA 86:8932-8935 (1989); and Zijlstra et al., Nature 342:435-438
(1989), the disclosures of each of which are incorporated by reference in their
entireties).

Uses of the Polynucleotides

Each of the polynucleotides identified herein can be used in numerous ways as
reagents. The following description should be considered exemplary and utilizes
known techniques. 

The polynucleotides of the present invention are useful for chromosome 
identification. There exists an ongoing need to identify new chromosome markers, 
since few chromosome marking reagents, based on actual sequence data (repeat
polymorphisms), are presently available. Each polynucleotide of the present
invention can be used as a chromosome marker. 

Briefly, sequences can be mapped to chromosomes by preparing PCR primers
(preferably 15-25 bp) from the sequences shown in SEQ ID NO:X. Primers can be
selected using computer analysis so that primers do not span more than one predicted
exon in the genomic DNA. These primers are then used for PCR screening of
somatic cell hybrids containing individual human chromosomes. Only those hybrids
containing the human gene corresponding to the SEQ ID NO:X will yield an
amplified fragment. 
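The primer selection rule described above (15-25 bp, not spanning more than one predicted exon) can be sketched as a simple filter. This is a minimal illustration under stated assumptions: the function name, the exon coordinates, and the example sequence are hypothetical, and practical primer design would also weigh melting temperature, specificity, and secondary structure, none of which is modeled here.

```python
def primers_within_exons(cdna, exon_bounds, primer_len=20):
    """List (start, primer) pairs that lie entirely inside a single exon.

    exon_bounds holds half-open (start, end) intervals in cDNA coordinates,
    assumed to come from a separate exon prediction step."""
    if not 15 <= primer_len <= 25:
        raise ValueError("primer length outside the preferred 15-25 bp range")
    primers = []
    for exon_start, exon_end in exon_bounds:
        # The last valid start keeps the whole primer inside this exon.
        for pos in range(exon_start, exon_end - primer_len + 1):
            primers.append((pos, cdna[pos:pos + primer_len]))
    return primers


# Hypothetical 60-base sequence with two predicted 30-base exons.
cdna = "ACGT" * 15
exons = [(0, 30), (30, 60)]
candidates = primers_within_exons(cdna, exons, primer_len=20)
print(len(candidates))
```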

Similarly, somatic hybrids provide a rapid method of PCR mapping the
polynucleotides to particular chromosomes. Three or more clones can be assigned per 
day using a single thermal cycler. Moreover, sublocalization of the polynucleotides 
can be achieved with panels of specific chromosome fragments. Other gene mapping 
strategies that can be used include in situ hybridization, prescreening with labeled 
flow-sorted chromosomes, and preselection by hybridization to construct 
chromosome specific-cDNA libraries. 

Precise chromosomal location of the polynucleotides can also be achieved 
using fluorescence in situ hybridization (FISH) of a metaphase chromosomal spread. 
This technique uses polynucleotides as short as 500 or 600 bases; however,
polynucleotides 2,000-4,000 bp are preferred. For a review of this technique, see
Verma et al., "Human Chromosomes: a Manual of Basic Techniques," Pergamon
Press, New York (1988). 

For chromosome mapping, the polynucleotides can be used individually (to
mark a single chromosome or a single site on that chromosome) or in panels (for 
marking multiple sites and/or multiple chromosomes). Preferred polynucleotides 
correspond to the noncoding regions of the cDNAs because the coding sequences are 
more likely conserved within gene families, thus increasing the chance of cross 
hybridization during chromosomal mapping. 

Once a polynucleotide has been mapped to a precise chromosomal location, 
the physical position of the polynucleotide can be used in linkage analysis. Linkage 
analysis establishes coinheritance between a chromosomal location and presentation 
of a particular disease. (Disease mapping data are found, for example, in V.
McKusick, Mendelian Inheritance in Man (available on line through Johns Hopkins
University Welch Medical Library).) Assuming 1 megabase mapping resolution and
one gene per 20 kb, a cDNA precisely localized to a chromosomal region associated
with the disease could be one of 50-500 potential causative genes.
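The lower figure in this estimate follows directly from the stated assumptions: 1,000,000 bp divided by 20,000 bp per gene gives 50 genes. The sketch below works that arithmetic. The text does not say how the upper figure of 500 arises; the 10-megabase region used here is only an assumption showing one region size that would give 500 at the same gene density.

```python
def candidate_gene_count(region_bp, bp_per_gene=20_000):
    """Number of genes expected in a region at a uniform gene density."""
    return region_bp // bp_per_gene


# Stated assumptions: 1 megabase resolution, one gene per 20 kb.
print(candidate_gene_count(1_000_000))

# Assumption for illustration only: a coarser, 10-megabase region.
print(candidate_gene_count(10_000_000))
```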

Thus, once coinheritance is established, differences in the polynucleotide and 
the corresponding gene between affected and unaffected individuals can be examined. 
First, visible structural alterations in the chromosomes, such as deletions or
translocations, are examined in chromosome spreads or by PCR. If no structural
alterations exist, the presence of point mutations is ascertained. Mutations observed
in some or all affected individuals, but not in normal individuals, indicate that the
mutation may cause the disease. However, complete sequencing of the polypeptide
and the corresponding gene from several normal individuals is required to distinguish
the mutation from a polymorphism. If a new polymorphism is identified, this 
polymorphic polypeptide can be used for further linkage analysis. 

Furthermore, increased or decreased expression of the gene in affected 
individuals as compared to unaffected individuals can be assessed using
polynucleotides of the present invention. Any of these alterations (altered expression,
chromosomal rearrangement, or mutation) can be used as a diagnostic or prognostic 
marker. 

In addition to the foregoing, a polynucleotide can be used to control gene 
expression through triple helix formation or antisense DNA or RNA. Both methods
rely on binding of the polynucleotide to DNA or RNA. For these techniques,
preferred polynucleotides are usually 20 to 40 bases in length and complementary to
either the region of the gene involved in transcription (triple helix - see Lee et al.,
Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J.
Neurochem. 56:560 (1991); Oligodeoxy-nucleotides as Antisense Inhibitors of Gene
Expression, CRC Press, Boca Raton, FL (1988)). Triple helix formation optimally
results in a shut-off of RNA transcription from DNA, while antisense RNA
hybridization blocks translation of an mRNA molecule into polypeptide. Both
techniques are effective in model systems, and the information disclosed herein can
be used to design antisense or triple helix polynucleotides in an effort to treat disease.

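For the antisense case, an oligonucleotide complementary to a stretch of mRNA is the reverse complement of that stretch. The sketch below shows only this base-pairing step for a window of 20 to 40 bases; the function name and the example sequence are hypothetical, and choosing which region of the mRNA to target is outside its scope.

```python
# Watson-Crick pairing for RNA: A pairs with U, G pairs with C.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna, start, length=20):
    """Return the antisense RNA (written 5' to 3') that is complementary
    to mrna[start:start + length]."""
    if not 20 <= length <= 40:
        raise ValueError("length outside the preferred 20-40 base range")
    target = mrna.upper()[start:start + length]
    if len(target) < length:
        raise ValueError("window runs past the end of the mRNA")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(COMPLEMENT)[::-1]


# Hypothetical target: ten A residues followed by ten C residues.
print(antisense_rna("A" * 10 + "C" * 10, 0, 20))
```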
Polynucleotides of the present invention are also useful in gene therapy. One 
goal of gene therapy is to insert a normal gene into an organism having a defective 
gene, in an effort to correct the genetic defect. The polynucleotides disclosed in the 
present invention offer a means of targeting such genetic defects in a highly accurate
manner. Another goal is to insert a new gene that was not present in the host genome,
thereby producing a new trait in the host cell.

The polynucleotides are also useful for identifying individuals from minute
biological samples. The United States military, for example, is considering the use of
restriction fragment length polymorphism (RFLP) for identification of its personnel.
In this technique, an individual's genomic DNA is digested with one or more
restriction enzymes, and probed on a Southern blot to yield unique bands for
identifying personnel. This method does not suffer from the current limitations of
"Dog Tags" which can be lost, switched, or stolen, making positive identification 
difficult. The polynucleotides of the present invention can be used as additional DNA 
markers for RFLP. 
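The principle behind RFLP can be sketched with an in-silico digest: a sequence difference that creates or destroys a recognition site changes the set of fragment lengths. The example below is a toy illustration. The sequences are hypothetical, the enzyme is assumed to be EcoRI (recognition site GAATTC, cut after the first G), and a real Southern blot shows only the fragments that hybridize to the probe.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut dna at every occurrence of site and return the sorted fragment
    lengths. cut_offset is the cut position within the site."""
    fragments = []
    last_cut = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[last_cut:cut])
        last_cut = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[last_cut:])
    return sorted(len(fragment) for fragment in fragments)


# Two hypothetical alleles differing by one base in the second site.
allele_a = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"
allele_b = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTA" + "TT"
print(digest(allele_a), digest(allele_b))
```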

The polynucleotides of the present invention can also be used as an alternative
to RFLP, by determining the actual base-by-base DNA sequence of selected portions
of an individual's genome. These sequences can be used to prepare PCR primers for
amplifying and isolating such selected DNA, which can then be sequenced. Using
this technique, individuals can be identified because each individual will have a
unique set of DNA sequences. Once a unique ID database is established for an
individual, positive identification of that individual, living or dead, can be made from
extremely small tissue samples. 

Forensic biology also benefits from using DNA-based identification 
techniques as disclosed herein. DNA sequences taken from very small biological
samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, semen,
etc., can be amplified using PCR. In one prior art technique, gene sequences
amplified from polymorphic loci, such as DQa class II HLA gene, are used in forensic
biology to identify individuals. (Erlich, H., PCR Technology, Freeman and Co.
(1992).) Once these specific polymorphic loci are amplified, they are digested with
one or more restriction enzymes, yielding an identifying set of bands on a Southern
blot probed with DNA corresponding to the DQa class II HLA gene. Similarly, 
polynucleotides of the present invention can be used as polymorphic markers for 
forensic purposes. 

There is also a need for reagents capable of identifying the source of a
particular tissue. Such need arises, for example, in forensics when presented with
tissue of unknown origin. Appropriate reagents can comprise, for example, DNA
probes or primers specific to particular tissue prepared from the sequences of the
present invention. Panels of such reagents can identify tissue by species and/or by
organ type. In a similar fashion, these reagents can be used to screen tissue cultures 
for contamination. 

In the very least, the polynucleotides of the present invention can be used as 
molecular weight markers on Southern gels, as diagnostic probes for the presence of a
specific mRNA in a particular cell type, as a probe to "subtract-out" known sequences 
in the process of discovering novel polynucleotides, for selecting and making 
oligomers for attachment to a "gene chip" or other support, to raise anti-DNA 
antibodies using DNA immunization techniques, and as an antigen to elicit an 
immune response.

Uses of the Polypeptides 

Each of the polypeptides identified herein can be used in numerous ways. The 
following description should be considered exemplary and utilizes known techniques. 

A polypeptide of the present invention can be used to assay protein levels in a
biological sample using antibody-based techniques. For example, protein expression
in tissues can be studied with classical immunohistological methods. (Jalkanen, M.,
et al., J. Cell. Biol. 101:976-985 (1985); Jalkanen, M., et al., J. Cell. Biol. 105:3087-3096
(1987).) Other antibody-based methods useful for detecting protein gene
expression include immunoassays, such as the enzyme linked immunosorbent assay
(ELISA) and the radioimmunoassay (RIA). Suitable antibody assay labels are known
in the art and include enzyme labels, such as glucose oxidase, and radioisotopes, such
as iodine (125I, 121I), carbon (14C), sulfur (35S), tritium (3H), indium (112In), and
technetium (99mTc), and fluorescent labels, such as fluorescein and rhodamine, and
biotin.

In addition to assaying secreted protein levels in a biological sample, proteins 
can also be detected in vivo by imaging. Antibody labels or markers for in vivo 
imaging of protein include those detectable by X-radiography, NMR or ESR. For X- 
radiography, suitable labels include radioisotopes such as barium or cesium, which 
emit detectable radiation but are not overtly harmful to the subject. Suitable markers
for NMR and ESR include those with a detectable characteristic spin, such as
deuterium, which may be incorporated into the antibody by labeling of nutrients for
the relevant hybridoma. 

A protein-specific antibody or antibody fragment which has been labeled with 
an appropriate detectable imaging moiety, such as a radioisotope (for example, 131I,
112In, 99mTc), a radio-opaque substance, or a material detectable by nuclear
magnetic resonance, is introduced (for example, parenterally, subcutaneously, or 
intraperitoneally) into the mammal. It will be understood in the art that the size of the 
subject and the imaging system used will determine the quantity of imaging moiety 
needed to produce diagnostic images. In the case of a radioisotope moiety, for a 
human subject, the quantity of radioactivity injected will normally range from about 5
to 20 millicuries of 99mTc. The labeled antibody or antibody fragment will then 
preferentially accumulate at the location of cells which contain the specific protein. 
In vivo tumor imaging is described in S.W. Burchiel et al., "Immunopharmacokinetics 
of Radiolabeled Antibodies and Their Fragments." (Chapter 13 in Tumor Imaging: 
The Radiochemical Detection of Cancer, S.W. Burchiel and B. A. Rhodes, eds., 
Masson Publishing Inc. (1982).) 

Thus, the invention provides a method of diagnosing a disorder, which
involves (a) assaying the expression of a polypeptide of the present invention in cells
or body fluid of an individual; and (b) comparing the level of gene expression with a
standard gene expression level, whereby an increase or decrease in the assayed
polypeptide gene expression level compared to the standard expression level is
indicative of a disorder.
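Step (b) of this method is a comparison against a standard level, which can be sketched as follows. The function name and the 20% tolerance band are assumptions introduced for illustration; the text does not specify how large a change must be before it is treated as an increase or decrease.

```python
def classify_expression(assayed, standard, tolerance=0.2):
    """Compare an assayed expression level with a standard level.

    tolerance is a hypothetical fractional band around the standard
    within which the two levels are treated as equivalent."""
    if standard <= 0:
        raise ValueError("standard level must be positive")
    ratio = assayed / standard
    if ratio > 1 + tolerance:
        return "increased"
    if ratio < 1 - tolerance:
        return "decreased"
    return "within standard range"


# Arbitrary units; the values are made up for the example.
print(classify_expression(150, 100))
print(classify_expression(50, 100))
print(classify_expression(105, 100))
```

Under the method described, either an "increased" or a "decreased" result would be taken as indicative of a disorder.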

Moreover, polypeptides of the present invention can be used to treat disease.
For example, patients can be administered a polypeptide of the present invention in an
effort to replace absent or decreased levels of the polypeptide (e.g., insulin), to
supplement absent or decreased levels of a different polypeptide (e.g., hemoglobin S
for hemoglobin B), to inhibit the activity of a polypeptide (e.g., an oncogene), to
activate the activity of a polypeptide (e.g., by binding to a receptor), to reduce the
activity of a membrane bound receptor by competing with it for free ligand (e.g.,
soluble TNF receptors used in reducing inflammation), or to bring about a desired
response (e.g., blood vessel growth).

Similarly, antibodies directed to a polypeptide of the present invention can
also be used to treat disease. For example, administration of an antibody directed to a 
polypeptide of the present invention can bind and reduce overproduction of the 
polypeptide. Similarly, administration of an antibody can activate the polypeptide,
such as by binding to a polypeptide bound to a membrane (receptor).

At the very least, the polypeptides of the present invention can be used as 
molecular weight markers on SDS-PAGE gels or on molecular sieve gel filtration 
columns using methods well known to those of skill in the art. Polypeptides can also 
be used to raise antibodies, which in turn are used to measure protein expression from
a recombinant cell, as a way of assessing transformation of the host cell. Moreover,
the polypeptides of the present invention can be used to test the following biological 
activities. 

Biological Activities 

The polynucleotides and polypeptides of the present invention can be used in
assays to test for one or more biological activities. If these polynucleotides and
polypeptides do exhibit activity in a particular assay, it is likely that these molecules 
may be involved in the diseases associated with the biological activity. Thus, the 
polynucleotides and polypeptides could be used to treat the associated disease. 


Immune Activity 

A polypeptide or polynucleotide of the present invention may be useful in 
treating deficiencies or disorders of the immune system, by activating or inhibiting the 
proliferation, differentiation, or mobilization (chemotaxis) of immune cells. Immune
cells develop through a process called hematopoiesis, producing myeloid (platelets,
red blood cells, neutrophils, and macrophages) and lymphoid (B and T lymphocytes)
cells from pluripotent stem cells. The etiology of these immune deficiencies or
disorders may be genetic, somatic, such as cancer or some autoimmune disorders,
acquired (e.g., by chemotherapy or toxins), or infectious. Moreover, a polynucleotide
or polypeptide of the present invention can be used as a marker or detector of a
particular immune system disease or disorder.

A polynucleotide or polypeptide of the present invention may be useful in
treating or detecting deficiencies or disorders of hematopoietic cells. A 
polypeptide or polynucleotide of the present invention could be used to increase 
differentiation and proliferation of hematopoietic cells, including the pluripotent stem
cells, in an effort to treat those disorders associated with a decrease in certain (or
many) types of hematopoietic cells. Examples of immunologic deficiency syndromes
include, but are not limited to: blood protein disorders (e.g., agammaglobulinemia,
dysgammaglobulinemia), ataxia telangiectasia, common variable immunodeficiency,
DiGeorge Syndrome, HIV infection, HTLV-BLV infection, leukocyte adhesion
deficiency syndrome, lymphopenia, phagocyte bactericidal dysfunction, severe
combined immunodeficiency (SCIDs), Wiskott-Aldrich Disorder, anemia,
thrombocytopenia, or hemoglobinuria. 

Moreover, a polypeptide or polynucleotide of the present invention could also 
be used to modulate hemostatic (the stopping of bleeding) or thrombolytic activity
(clot formation). For example, by increasing hemostatic or thrombolytic activity, a
polynucleotide or polypeptide of the present invention could be used to treat blood
coagulation disorders (e.g., afibrinogenemia, factor deficiencies), blood platelet
disorders (e.g., thrombocytopenia), or wounds resulting from trauma, surgery, or other
causes. Alternatively, a polynucleotide or polypeptide of the present invention that
can decrease hemostatic or thrombolytic activity could be used to inhibit or dissolve
clotting. These molecules could be important in the treatment of heart attacks 
(infarction), strokes, or scarring. 

A polynucleotide or polypeptide of the present invention may also be useful in 
treating or detecting autoimmune disorders. Many autoimmune disorders result from
inappropriate recognition of self as foreign material by immune cells. This
inappropriate recognition results in an immune response leading to the destruction of
the host tissue. Therefore, the administration of a polypeptide or polynucleotide of the
present invention that inhibits an immune response, particularly the proliferation,
differentiation, or chemotaxis of T-cells, may be an effective therapy in preventing
autoimmune disorders.

Examples of autoimmune disorders that can be treated or detected by the 
present invention include, but are not limited to: Addison's Disease, hemolytic
anemia, antiphospholipid syndrome, rheumatoid arthritis, dermatitis, allergic
encephalomyelitis, glomerulonephritis, Goodpasture's Syndrome, Graves' Disease,
Multiple Sclerosis, Myasthenia Gravis, Neuritis, Ophthalmia, Bullous Pemphigoid,
Pemphigus, Polyendocrinopathies, Purpura, Reiter's Disease, Stiff-Man Syndrome,
Autoimmune Thyroiditis, Systemic Lupus Erythematosus, Autoimmune Pulmonary
Inflammation, Guillain-Barre Syndrome, insulin dependent diabetes mellitus, and
autoimmune inflammatory eye disease. 

Similarly, allergic reactions and conditions, such as asthma (particularly 
allergic asthma) or other respiratory problems, may also be treated by a polypeptide
or polynucleotide of the present invention. Moreover, these molecules can be used to
treat anaphylaxis, hypersensitivity to an antigenic molecule, or blood group 
incompatibility. 

A polynucleotide or polypeptide of the present invention may also be used to
treat and/or prevent organ rejection or graft-versus-host disease (GVHD). Organ
rejection occurs by host immune cell destruction of the transplanted tissue through an
immune response. Similarly, an immune response is also involved in GVHD, but, in
this case, the foreign transplanted immune cells destroy the host tissues. The
administration of a polypeptide or polynucleotide of the present invention that inhibits
an immune response, particularly the proliferation, differentiation, or chemotaxis of
T-cells, may be an effective therapy in preventing organ rejection or GVHD.

Similarly, a polypeptide or polynucleotide of the present invention may also
be used to modulate inflammation. For example, the polypeptide or polynucleotide
may inhibit the proliferation and differentiation of cells involved in an inflammatory
response. These molecules can be used to treat inflammatory conditions, both chronic
and acute conditions, including inflammation associated with infection (e.g., septic
shock, sepsis, or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion
injury, endotoxin lethality, arthritis, complement-mediated hyperacute
rejection, nephritis, cytokine or chemokine induced lung injury, inflammatory bowel
disease, Crohn's disease, or resulting from over production of cytokines (e.g., TNF or
IL-1).

Hyperproliferative Disorders

A polypeptide or polynucleotide can be used to treat or detect
hyperproliferative disorders, including neoplasms. A polypeptide or polynucleotide
of the present invention may inhibit the proliferation of the disorder through direct or
indirect interactions. Alternatively, a polypeptide or polynucleotide of the present
invention may proliferate other cells which can inhibit the hyperproliferative disorder.

For example, by increasing an immune response, particularly increasing
antigenic qualities of the hyperproliferative disorder or by proliferating,
differentiating, or mobilizing T-cells, hyperproliferative disorders can be treated.
This immune response may be increased by either enhancing an existing immune
response, or by initiating a new immune response. Alternatively, decreasing an
immune response may also be a method of treating hyperproliferative disorders, such
as with a chemotherapeutic agent.

Examples of hyperproliferative disorders that can be treated or detected by a 
polynucleotide or polypeptide of the present invention include, but are not limited to,
neoplasms located in the: abdomen, bone, breast, digestive system, liver, pancreas,
peritoneum, endocrine glands (adrenal, parathyroid, pituitary, testicles, ovary, thymus,
thyroid), eye, head and neck, nervous system (central and peripheral), lymphatic system,
pelvic, skin, soft tissue, spleen, thoracic, and urogenital.

Similarly, other hyperproliferative disorders can also be treated or detected by
a polynucleotide or polypeptide of the present invention. Examples of such
hyperproliferative disorders include, but are not limited to:
hypergammaglobulinemia, lymphoproliferative disorders, paraproteinemias, purpura,
sarcoidosis, Sezary Syndrome, Waldenstrom's Macroglobulinemia, Gaucher's
Disease, histiocytosis, and any other hyperproliferative disease, besides neoplasia,
located in an organ system listed above.

Infectious Disease

A polypeptide or polynucleotide of the present invention can be used to treat 
or detect infectious agents. For example, by increasing the immune response, 
particularly increasing the proliferation and differentiation of B and/or T cells, 
infectious diseases may be treated. The immune response may be increased by either 
enhancing an existing immune response, or by initiating a new immune response.
Alternatively, the polypeptide or polynucleotide of the present invention may also
directly inhibit the infectious agent, without necessarily eliciting an immune response. 

Viruses are one example of an infectious agent that can cause disease or
symptoms that can be treated or detected by a polynucleotide or polypeptide of the
present invention. Examples of viruses include, but are not limited to, the following
DNA and RNA viral families: Arbovirus, Adenoviridae, Arenaviridae, Arterivirus,
Birnaviridae, Bunyaviridae, Caliciviridae, Circoviridae, Coronaviridae, Flaviviridae,
Hepadnaviridae (Hepatitis), Herpesviridae (such as Cytomegalovirus, Herpes
Simplex, Herpes Zoster), Mononegavirus (e.g., Paramyxoviridae, Morbillivirus,
Rhabdoviridae), Orthomyxoviridae (e.g., Influenza), Papovaviridae, Parvoviridae,
Picornaviridae, Poxviridae (such as Smallpox or Vaccinia), Reoviridae (e.g.,
Rotavirus), Retroviridae (HTLV-I, HTLV-II, Lentivirus), and Togaviridae (e.g.,
Rubivirus). Viruses falling within these families can cause a variety of diseases or
symptoms, including, but not limited to: arthritis, bronchiolitis, encephalitis, eye
infections (e.g., conjunctivitis, keratitis), chronic fatigue syndrome, hepatitis (A, B, C,
E, Chronic Active, Delta), meningitis, opportunistic infections (e.g., AIDS),
pneumonia, Burkitt's Lymphoma, chickenpox, hemorrhagic fever, Measles, Mumps,
Parainfluenza, Rabies, the common cold, Polio, leukemia, Rubella, sexually
transmitted diseases, skin diseases (e.g., Kaposi's, warts), and viremia. A polypeptide
or polynucleotide of the present invention can be used to treat or detect any of these
symptoms or diseases.

Similarly, bacterial or fungal agents that can cause disease or symptoms and
that can be treated or detected by a polynucleotide or polypeptide of the present
invention include, but are not limited to, the following Gram-negative and
Gram-positive bacterial families and fungi: Actinomycetales (e.g., Corynebacterium,
Mycobacterium, Nocardia), Aspergillosis, Bacillaceae (e.g., Anthrax, Clostridium),
Bacteroidaceae, Blastomycosis, Bordetella, Borrelia, Brucellosis, Candidiasis,
Campylobacter, Coccidioidomycosis, Cryptococcosis, Dermatomycoses,
Enterobacteriaceae (Klebsiella, Salmonella, Serratia, Yersinia), Erysipelothrix,
Helicobacter, Legionellosis, Leptospirosis, Listeria, Mycoplasmatales, Neisseriaceae
(e.g., Acinetobacter, Gonorrhea, Meningococcal), Pasteurellaceae Infections (e.g.,
Actinobacillus, Haemophilus, Pasteurella), Pseudomonas, Rickettsiaceae,
Chlamydiaceae, Syphilis, and Staphylococcal. These bacterial or fungal families can
cause the following diseases or symptoms, including, but not limited to: bacteremia,
endocarditis, eye infections (conjunctivitis, tuberculosis, uveitis), gingivitis,
opportunistic infections (e.g., AIDS related infections), paronychia, prosthesis-related
infections, Reiter's Disease, respiratory tract infections, such as Whooping Cough or
Empyema, sepsis, Lyme Disease, Cat-Scratch Disease, Dysentery, Paratyphoid Fever,
food poisoning, Typhoid, pneumonia, Gonorrhea, meningitis, Chlamydia, Syphilis,
Diphtheria, Leprosy, Paratuberculosis, Tuberculosis, Lupus, Botulism, gangrene,
tetanus, impetigo, Rheumatic Fever, Scarlet Fever, sexually transmitted diseases, skin
diseases (e.g., cellulitis, dermatomycoses), toxemia, urinary tract infections, and wound
infections. A polypeptide or polynucleotide of the present invention can be used to
treat or detect any of these symptoms or diseases.

Moreover, parasitic agents causing disease or symptoms that can be treated or
detected by a polynucleotide or polypeptide of the present invention include, but are
not limited to, the following families: Amebiasis, Babesiosis, Coccidiosis,
Cryptosporidiosis, Dientamoebiasis, Dourine, Ectoparasitic, Giardiasis,
Helminthiasis, Leishmaniasis, Theileriasis, Toxoplasmosis, Trypanosomiasis, and
Trichomonas. These parasites can cause a variety of diseases or symptoms, including,
but not limited to: Scabies, Trombiculiasis, eye infections, intestinal disease (e.g.,
dysentery, giardiasis), liver disease, lung disease, opportunistic infections (e.g., AIDS
related), Malaria, pregnancy complications, and toxoplasmosis. A polypeptide or
polynucleotide of the present invention can be used to treat or detect any of these
symptoms or diseases.

Preferably, treatment using a polypeptide or polynucleotide of the present 
invention could either be by administering an effective amount of a polypeptide to the 
patient, or by removing cells from the patient, supplying the cells with a 
polynucleotide of the present invention, and returning the engineered cells to the 
patient (ex vivo therapy). Moreover, the polypeptide or polynucleotide of the present 
invention can be used as an antigen in a vaccine to raise an immune response against
infectious disease. 



Regeneration

A polynucleotide or polypeptide of the present invention can be used to
differentiate, proliferate, and attract cells, leading to the regeneration of tissues. (See,
Science 276:59-87 (1997).) The regeneration of tissues could be used to repair,
replace, or protect tissue damaged by congenital defects, trauma (wounds, burns,
incisions, or ulcers), age, disease (e.g., osteoporosis, osteoarthritis, periodontal
disease, liver failure), surgery, including cosmetic plastic surgery, fibrosis,
reperfusion injury, or systemic cytokine damage.

Tissues that could be regenerated using the present invention include organs
(e.g., pancreas, liver, intestine, kidney, skin, endothelium), muscle (smooth, skeletal
or cardiac), vasculature (including vascular and lymphatics), nervous, hematopoietic,
and skeletal (bone, cartilage, tendon, and ligament) tissue. Preferably, regeneration
occurs without scarring or with decreased scarring. Regeneration also may include
angiogenesis.

Moreover, a polynucleotide or polypeptide of the present invention may
increase regeneration of tissues difficult to heal. For example, increased
tendon/ligament regeneration would quicken recovery time after damage. A
polynucleotide or polypeptide of the present invention could also be used
prophylactically in an effort to avoid damage. Specific diseases that could be treated
include tendinitis, carpal tunnel syndrome, and other tendon or ligament defects. A
further example of tissue regeneration of non-healing wounds includes pressure
ulcers, ulcers associated with vascular insufficiency, and surgical and traumatic wounds.

Similarly, nerve and brain tissue could also be regenerated by using a
polynucleotide or polypeptide of the present invention to proliferate and differentiate
nerve cells. Diseases that could be treated using this method include central and
peripheral nervous system diseases, neuropathies, or mechanical and traumatic
disorders (e.g., spinal cord disorders, head trauma, cerebrovascular disease, and
stroke). Specifically, diseases associated with peripheral nerve injuries, peripheral
neuropathy (e.g., resulting from chemotherapy or other medical therapies), localized
neuropathies, and central nervous system diseases (e.g., Alzheimer's disease,
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and
Shy-Drager syndrome), could all be treated using the polynucleotide or polypeptide
of the present invention.

Chemotaxis

A polynucleotide or polypeptide of the present invention may have
chemotaxis activity. A chemotactic molecule attracts or mobilizes cells (e.g.,
monocytes, fibroblasts, neutrophils, T-cells, mast cells, eosinophils, epithelial and/or
endothelial cells) to a particular site in the body, such as a site of inflammation,
infection, or hyperproliferation. The mobilized cells can then fight off and/or heal the
particular trauma or abnormality.

A polynucleotide or polypeptide of the present invention may increase 
chemotactic activity of particular cells. These chemotactic molecules can then be used
to treat inflammation, infection, hyperproliferative disorders, or any immune system
disorder by increasing the number of cells targeted to a particular location in the body.
For example, chemotactic molecules can be used to treat wounds and other trauma to
tissues by attracting immune cells to the injured location. Chemotactic molecules of 
the present invention can also attract fibroblasts, which can be used to treat wounds. 

It is also contemplated that a polynucleotide or polypeptide of the present 
invention may inhibit chemotactic activity. These molecules could also be used to 
treat disorders. Thus, a polynucleotide or polypeptide of the present invention could 
be used as an inhibitor of chemotaxis. 

Binding Activity

A polypeptide of the present invention may be used to screen for molecules 
that bind to the polypeptide or for molecules to which the polypeptide binds. The 
binding of the polypeptide and the molecule may activate (agonist), increase, inhibit 
(antagonist), or decrease activity of the polypeptide or the molecule bound. Examples
of such molecules include antibodies, oligonucleotides, proteins (e.g., receptors), or
small molecules. 

Preferably, the molecule is closely related to the natural ligand of the
polypeptide, e.g., a fragment of the ligand, or a natural substrate, a ligand, a structural
or functional mimetic. (See, Coligan et al., Current Protocols in Immunology
1(2):Chapter 5 (1991).) Similarly, the molecule can be closely related to the natural
receptor to which the polypeptide binds, or at least, a fragment of the receptor capable
of being bound by the polypeptide (e.g., active site). In either case, the molecule can
be rationally designed using known techniques.

Preferably, the screening for these molecules involves producing appropriate 
cells which express the polypeptide, either as a secreted protein or on the cell 
membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli.
Cells expressing the polypeptide (or cell membrane containing the expressed 
polypeptide) are then preferably contacted with a test compound potentially 
containing the molecule to observe binding, stimulation, or inhibition of activity of 
either the polypeptide or the molecule. 

The assay may simply test binding of a candidate compound to the
polypeptide, wherein binding is detected by a label, or in an assay involving
competition with a labeled competitor. Further, the assay may test whether the
candidate compound results in a signal generated by binding to the polypeptide.
Alternatively, the assay can be carried out using cell-free preparations,
polypeptide/molecule affixed to a solid support, chemical libraries, or natural product
mixtures. The assay may also simply comprise the steps of mixing a candidate
compound with a solution containing a polypeptide, measuring polypeptide/molecule
activity or binding, and comparing the polypeptide/molecule activity or binding to a
standard.
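
The final comparison step of such an assay can be illustrated with a minimal
sketch. It assumes the measured activity and the standard are both positive
numbers in the same units; the function name and the 10% tolerance band are
hypothetical choices for illustration only and are not taken from the
specification.

```python
def compare_to_standard(measured: float, standard: float,
                        tolerance: float = 0.10) -> str:
    """Classify a measured activity or binding value against a standard.

    `tolerance` is an illustrative band (here 10%) within which the
    measurement is treated as unchanged; it is an assumption, not a
    value given in the text.
    """
    if standard <= 0:
        raise ValueError("standard must be positive")
    ratio = measured / standard
    if ratio > 1 + tolerance:
        return "increased"   # candidate compound appears to enhance activity
    if ratio < 1 - tolerance:
        return "decreased"   # candidate compound appears to inhibit activity
    return "unchanged"
```

In practice the threshold would depend on the variability of the particular
assay, so any real cutoff would need to be set from replicate measurements.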

Preferably, an ELISA assay can measure polypeptide level or activity in a
sample (e.g., biological sample) using a monoclonal or polyclonal antibody. The
antibody can measure polypeptide level or activity by either binding, directly or
indirectly, to the polypeptide or by competing with the polypeptide for a substrate.
All of these above assays can be used as diagnostic or prognostic markers.
The molecules discovered using these assays can be used to treat disease or to bring
about a particular result in a patient (e.g., blood vessel growth) by activating or
inhibiting the polypeptide/molecule. Moreover, the assays can discover agents which
may inhibit or enhance the production of the polypeptide from suitably manipulated
cells or tissues.

Therefore, the invention includes a method of identifying compounds which
bind to a polypeptide of the invention comprising the steps of: (a) incubating a
candidate binding compound with a polypeptide of the invention; and (b) determining
if binding has occurred. Moreover, the invention includes a method of identifying
agonists/antagonists comprising the steps of: (a) incubating a candidate compound
with a polypeptide of the invention, (b) assaying a biological activity, and (c)
determining if a biological activity of the polypeptide has been altered.

Other Activities 

A polypeptide or polynucleotide of the present invention may also increase or 
decrease the differentiation or proliferation of embryonic stem cells, besides, as 
discussed above, hematopoietic lineage. 

A polypeptide or polynucleotide of the present invention may also be used to 
modulate mammalian characteristics, such as body height, weight, hair color, eye 
color, skin, percentage of adipose tissue, pigmentation, size, and shape (e.g., cosmetic
surgery). Similarly, a polypeptide or polynucleotide of the present invention may be 
used to modulate mammalian metabolism affecting catabolism, anabolism, 
processing, utilization, and storage of energy. 

A polypeptide or polynucleotide of the present invention may be used to 
change a mammal's mental state or physical state by influencing biorhythms,
circadian rhythms, depression (including depressive disorders), tendency for violence,
tolerance for pain, reproductive capabilities (preferably by Activin or Inhibin-like
activity), hormonal or endocrine levels, appetite, libido, memory, stress, or other 
cognitive qualities. 

A polypeptide or polynucleotide of the present invention may also be used as a 
food additive or preservative, such as to increase or decrease storage capabilities, fat
content, lipid, protein, carbohydrate, vitamins, minerals, cofactors or other nutritional
components. 

Other Preferred Embodiments

Other preferred embodiments of the claimed invention include an isolated 
nucleic acid molecule comprising a nucleotide sequence which is at least 95% 
identical to a sequence of at least about 50 contiguous nucleotides in the nucleotide 
sequence of SEQ ID NO:X wherein X is any integer as defined in Table 1.

Also preferred is a nucleic acid molecule wherein said sequence of contiguous
nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the range of
positions beginning with the nucleotide at about the position of the 5' Nucleotide of
the Clone Sequence and ending with the nucleotide at about the position of the 3'
Nucleotide of the Clone Sequence as defined for SEQ ID NO:X in Table 1.

Also preferred is a nucleic acid molecule wherein said sequence of contiguous 
nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the range of 
positions beginning with the nucleotide at about the position of the 5' Nucleotide of
the Start Codon and ending with the nucleotide at about the position of the 3'
Nucleotide of the Clone Sequence as defined for SEQ ID NO:X in Table 1.

Similarly preferred is a nucleic acid molecule wherein said sequence of 
contiguous nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the 
range of positions beginning with the nucleotide at about the position of the 5' 
Nucleotide of the First Amino Acid of the Signal Peptide and ending with the 
nucleotide at about the position of the 3' Nucleotide of the Clone Sequence as defined
for SEQ ID NO:X in Table 1.
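
The position ranges recited in the preceding paragraphs are given as
positions in a sequence listing, which are numbered from 1 and include both
end points, whereas most programming languages slice strings with 0-based,
half-open indices. The following minimal sketch shows the conversion. The
function name and the example coordinates are hypothetical; real start and end
positions would come from the corresponding columns of Table 1.

```python
def subsequence(seq: str, start: int, end: int) -> str:
    """Return the stretch of `seq` from position `start` to position `end`,
    using 1-based, inclusive coordinates as in a sequence listing."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("positions must satisfy 1 <= start <= end <= len(seq)")
    # Shift only the start: Python's slice end is already exclusive.
    return seq[start - 1:end]


# Hypothetical example: a toy sequence and made-up positions.
toy_sequence = "ACGTACGT"
region = subsequence(toy_sequence, 3, 6)  # positions 3 through 6
```

Because the claims say "at about the position", a real implementation might
also accept a small margin around each end point; that margin is not specified
here and is left out of the sketch.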

Also preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a sequence of at least about 150 
contiguous nucleotides in the nucleotide sequence of SEQ ID NO:X.

Further preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a sequence of at least about 500
contiguous nucleotides in the nucleotide sequence of SEQ ID NO:X. 
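
The "at least 95% identical to a sequence of at least about N contiguous
nucleotides" language can be illustrated with a simplified sketch. It assumes
an ungapped, position-by-position comparison, which is a simplification: a
practical determination of percent identity would normally use an alignment
program that handles insertions and deletions. The function names are
hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def best_window_identity(query: str, reference: str, window: int = 50) -> float:
    """Highest ungapped identity between any `window`-long stretch of `query`
    and any `window`-long stretch of `reference` (brute-force search)."""
    if window <= 0 or len(query) < window or len(reference) < window:
        raise ValueError("both sequences must be at least `window` long")
    best = 0.0
    for i in range(len(query) - window + 1):
        q = query[i:i + window]
        for j in range(len(reference) - window + 1):
            best = max(best, percent_identity(q, reference[j:j + window]))
    return best
```

Under these assumptions, a query would satisfy the 50-nucleotide embodiment
when `best_window_identity(query, reference, window=50) >= 95.0`; the 150 and
500 nucleotide embodiments differ only in the `window` argument. The
brute-force search is quadratic in sequence length and is meant only to make
the definition concrete.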

A further preferred embodiment is a nucleic acid molecule comprising a 
nucleotide sequence which is at least 95% identical to the nucleotide sequence of SEQ 
ID NO:X beginning with the nucleotide at about the position of the 5' Nucleotide of
the First Amino Acid of the Signal Peptide and ending with the nucleotide at about 
the position of the 3' Nucleotide of the Clone Sequence as defined for SEQ ID NO:X 
in Table 1. 

A further preferred embodiment is an isolated nucleic acid molecule 
comprising a nucleotide sequence which is at least 95% identical to the complete
nucleotide sequence of SEQ ID NO:X.

Also preferred is an isolated nucleic acid molecule which hybridizes under
stringent hybridization conditions to a nucleic acid molecule, wherein said nucleic 
acid molecule which hybridizes does not hybridize under stringent hybridization 
conditions to a nucleic acid molecule having a nucleotide sequence consisting of only 
A residues or of only T residues. 

Also preferred is a composition of matter comprising a DNA molecule which 
comprises a human cDNA clone identified by a cDNA Clone Identifier in Table 1, 
which DNA molecule is contained in the material deposited with the American Type 
Culture Collection and given the ATCC Deposit Number shown in Table 1 for said 
cDNA Clone Identifier. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a sequence of at least 50 contiguous 
nucleotides in the nucleotide sequence of a human cDNA clone identified by a cDNA 
Clone Identifier in Table 1, which DNA molecule is contained in the deposit given the 
ATCC Deposit Number shown in Table 1. 

Also preferred is an isolated nucleic acid molecule, wherein said sequence of 
at least 50 contiguous nucleotides is included in the nucleotide sequence of the 
complete open reading frame sequence encoded by said human cDNA clone. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a sequence of at least 150 contiguous
nucleotides in the nucleotide sequence encoded by said human cDNA clone. 

A further preferred embodiment is an isolated nucleic acid molecule 
comprising a nucleotide sequence which is at least 95% identical to a sequence of at
least 500 contiguous nucleotides in the nucleotide sequence encoded by said human 
cDNA clone. 

A further preferred embodiment is an isolated nucleic acid molecule 
comprising a nucleotide sequence which is at least 95% identical to the complete 
nucleotide sequence encoded by said human cDNA clone. 

A further preferred embodiment is a method for detecting in a biological 
sample a nucleic acid molecule comprising a nucleotide sequence which is at least 
95% identical to a sequence of at least 50 contiguous nucleotides in a sequence 
selected from the group consisting of: a nucleotide sequence of SEQ ID NO:X
wherein X is any integer as defined in Table 1; and a nucleotide sequence encoded by
a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained
in the deposit with the ATCC Deposit Number shown for said cDNA clone in Table
1; which method comprises a step of comparing a nucleotide sequence of at least one
nucleic acid molecule in said sample with a sequence selected from said group and
determining whether the sequence of said nucleic acid molecule in said sample is at
least 95% identical to said selected sequence.

Also preferred is the above method wherein said step of comparing sequences 
comprises determining the extent of nucleic acid hybridization between nucleic acid
molecules in said sample and a nucleic acid molecule comprising said sequence
selected from said group. Similarly, also preferred is the above method wherein said
step of comparing sequences is performed by comparing the nucleotide sequence
determined from a nucleic acid molecule in said sample with said sequence selected
from said group. The nucleic acid molecules can comprise DNA molecules or RNA
molecules.

A further preferred embodiment is a method for identifying the species, tissue 
or cell type of a biological sample which method comprises a step of detecting nucleic 
acid molecules in said sample, if any, comprising a nucleotide sequence that is at least 
95% identical to a sequence of at least 50 contiguous nucleotides in a sequence
selected from the group consisting of: a nucleotide sequence of SEQ ID NO:X
wherein X is any integer as defined in Table 1; and a nucleotide sequence encoded by
a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained 
in the deposit with the ATCC Deposit Number shown for said cDNA clone in Table 
1. 

The method for identifying the species, tissue or cell type of a biological
sample can comprise a step of detecting nucleic acid molecules comprising a
nucleotide sequence in a panel of at least two nucleotide sequences, wherein at least
one sequence in said panel is at least 95% identical to a sequence of at least 50
contiguous nucleotides in a sequence selected from said group.
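
The panel condition described above has two parts: the panel holds at least
two sequences, and at least one of them meets the identity threshold against
a sequence in the reference group. A minimal sketch of that logic follows. It
assumes an ungapped comparison for simplicity, and all function names are
hypothetical.

```python
def window_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length strings."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def matches_reference(seq: str, reference: str,
                      window: int = 50, threshold: float = 95.0) -> bool:
    """True if some `window`-long stretch of `seq` is at least `threshold`
    percent identical to some `window`-long stretch of `reference`."""
    if len(seq) < window or len(reference) < window:
        return False
    return any(
        window_identity(seq[i:i + window], reference[j:j + window]) >= threshold
        for i in range(len(seq) - window + 1)
        for j in range(len(reference) - window + 1)
    )


def panel_qualifies(panel, references,
                    window: int = 50, threshold: float = 95.0) -> bool:
    """A panel qualifies if it has at least two sequences and at least one
    of them matches at least one reference sequence."""
    if len(panel) < 2:
        return False
    return any(
        matches_reference(seq, ref, window, threshold)
        for seq in panel
        for ref in references
    )
```

The same structure would apply to the polypeptide panels described later in
this section by passing a window of 10 and a threshold of 90.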

Also preferred is a method for diagnosing in a subject a pathological condition
associated with abnormal structure or expression of a gene encoding a secreted
protein identified in Table 1, which method comprises a step of detecting in a
biological sample obtained from said subject nucleic acid molecules, if any,
comprising a nucleotide sequence that is at least 95% identical to a sequence of at
least 50 contiguous nucleotides in a sequence selected from the group consisting of: a
nucleotide sequence of SEQ ID NO:X wherein X is any integer as defined in Table 1;
and a nucleotide sequence encoded by a human cDNA clone identified by a cDNA
Clone Identifier in Table 1 and contained in the deposit with the ATCC Deposit
Number shown for said cDNA clone in Table 1.

The method for diagnosing a pathological condition can comprise a step of
detecting nucleic acid molecules comprising a nucleotide sequence in a panel of at
least two nucleotide sequences, wherein at least one sequence in said panel is at least
95% identical to a sequence of at least 50 contiguous nucleotides in a sequence
selected from said group.

Also preferred is a composition of matter comprising isolated nucleic acid
molecules wherein the nucleotide sequences of said nucleic acid molecules comprise
a panel of at least two nucleotide sequences, wherein at least one sequence in said
panel is at least 95% identical to a sequence of at least 50 contiguous nucleotides in a
sequence selected from the group consisting of: a nucleotide sequence of SEQ ID
NO:X wherein X is any integer as defined in Table 1; and a nucleotide sequence
encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table 1
and contained in the deposit with the ATCC Deposit Number shown for said cDNA
clone in Table 1. The nucleic acid molecules can comprise DNA molecules or RNA
molecules.

Also preferred is an isolated polypeptide comprising an amino acid sequence 
at least 90% identical to a sequence of at least about 10 contiguous amino acids in the 
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1.

Also preferred is a polypeptide, wherein said sequence of contiguous amino 
acids is included in the amino acid sequence of SEQ ID NO:Y in the range of
positions beginning with the residue at about the position of the First Amino Acid of
the Secreted Portion and ending with the residue at about the Last Amino Acid of the
Open Reading Frame as set forth for SEQ ID NO:Y in Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence
at least 95% identical to a sequence of at least about 30 contiguous amino acids in the 
amino acid sequence of SEQ ID NO:Y. 

Further preferred is an isolated polypeptide comprising an amino acid 
sequence at least 95% identical to a sequence of at least about 100 contiguous amino
acids in the amino acid sequence of SEQ ID NO:Y. 

Further preferred is an isolated polypeptide comprising an amino acid 
sequence at least 95% identical to the complete amino acid sequence of SEQ ID 
NO:Y. 

Further preferred is an isolated polypeptide comprising an amino acid
sequence at least 90% identical to a sequence of at least about 10 contiguous amino
acids in the complete amino acid sequence of a secreted protein encoded by a human 
cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in the 
deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1. 

Also preferred is a polypeptide wherein said sequence of contiguous amino
acids is included in the amino acid sequence of a secreted portion of the secreted
protein encoded by a human cDNA clone identified by a cDNA Clone Identifier in 
Table 1 and contained in the deposit with the ATCC Deposit Number shown for said 
cDNA clone in Table 1. 

Also preferred is an isolated polypeptide comprising an amino acid sequence
at least 95% identical to a sequence of at least about 30 contiguous amino acids in the
amino acid sequence of the secreted portion of the protein encoded by a human cDNA 
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit 
with the ATCC Deposit Number shown for said cDNA clone in Table 1. 

Also preferred is an isolated polypeptide comprising an amino acid sequence
at least 95% identical to a sequence of at least about 100 contiguous amino acids in
the amino acid sequence of the secreted portion of the protein encoded by a human
cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in the 
deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1. 

Also preferred is an isolated polypeptide comprising an amino acid sequence
at least 95% identical to the amino acid sequence of the secreted portion of the protein
encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table 1
and contained in the deposit with the ATCC Deposit Number shown for said cDNA
clone in Table 1.

Further preferred is an isolated antibody which binds specifically to a 
polypeptide comprising an amino acid sequence that is at least 90% identical to a 
sequence of at least 10 contiguous amino acids in a sequence selected from the group 
consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is any integer as
defined in Table 1; and a complete amino acid sequence of a protein encoded by a
human cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained 
in the deposit with the ATCC Deposit Number shown for said cDNA clone in Table 
1. 

Further preferred is a method for detecting in a biological sample a 
polypeptide comprising an amino acid sequence which is at least 90% identical to a 
sequence of at least 10 contiguous amino acids in a sequence selected from the group 
consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is any integer as
defined in Table 1; and a complete amino acid sequence of a protein encoded by a
human cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained 
in the deposit with the ATCC Deposit Number shown for said cDNA clone in Table 
1; which method comprises a step of comparing an amino acid sequence of at least 
one polypeptide molecule in said sample with a sequence selected from said group 
and determining whether the sequence of said polypeptide molecule in said sample is 
at least 90% identical to said sequence of at least 10 contiguous amino acids. 

Also preferred is the above method wherein said step of comparing an amino 
acid sequence of at least one polypeptide molecule in said sample with a sequence 
selected from said group comprises determining the extent of specific binding of 
polypeptides in said sample to an antibody which binds specifically to a polypeptide 
comprising an amino acid sequence that is at least 90% identical to a sequence of at 
least 10 contiguous amino acids in a sequence selected from the group consisting of:
an amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in
Table 1; and a complete amino acid sequence of a protein encoded by a human cDNA
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit
with the ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is the above method wherein said step of comparing sequences
is performed by comparing the amino acid sequence determined from a polypeptide 
molecule in said sample with said sequence selected from said group. 
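
The sequence comparison recited above can be illustrated with a minimal sketch. It assumes an ungapped, window-by-window reading of "at least 90% identical to a sequence of at least 10 contiguous amino acids"; the function names and both example sequences are hypothetical and are not taken from Table 1. In practice a sequence alignment program would normally be used for the comparison.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def matches_reference(sample: str, reference: str,
                      window: int = 10, threshold: float = 90.0) -> bool:
    """True if any `window`-residue stretch of `sample` is at least
    `threshold` percent identical to some `window`-residue stretch of
    `reference` (ungapped comparison; an assumption of this sketch)."""
    ref_windows = [reference[i:i + window]
                   for i in range(len(reference) - window + 1)]
    for j in range(len(sample) - window + 1):
        stretch = sample[j:j + window]
        if any(percent_identity(stretch, r) >= threshold for r in ref_windows):
            return True
    return False


# Hypothetical sequences for illustration only.
reference = "MKTLLLAVAVLGSQAEPTRW"
sample = "GGLAVAVLGAQAEPGG"  # shares a 10-residue stretch with one mismatch
print(matches_reference(sample, reference))
```

The window length and threshold are parameters so that the same check can be run at other stringencies.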

Also preferred is a method for identifying the species, tissue or cell type of a 
biological sample which method comprises a step of detecting polypeptide molecules 
in said sample, if any, comprising an amino acid sequence that is at least 90% 
identical to a sequence of at least 10 contiguous amino acids in a sequence selected 
from the group consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is 
any integer as defined in Table 1; and a complete amino acid sequence of a secreted 
protein encoded by a human cDNA clone identified by a cDNA Clone Identifier in 
Table 1 and contained in the deposit with the ATCC Deposit Number shown for said 
cDNA clone in Table 1. 

Also preferred is the above method for identifying the species, tissue or cell 
type of a biological sample, which method comprises a step of detecting polypeptide 
molecules comprising an amino acid sequence in a panel of at least two amino acid 
sequences, wherein at least one sequence in said panel is at least 90% identical to a 
sequence of at least 10 contiguous amino acids in a sequence selected from the above 
group. 

Also preferred is a method for diagnosing in a subject a pathological condition 
associated with abnormal structure or expression of a gene encoding a secreted 
protein identified in Table 1, which method comprises a step of detecting in a 
biological sample obtained from said subject polypeptide molecules comprising an 
amino acid sequence in a panel of at least two amino acid sequences, wherein at least 
one sequence in said panel is at least 90% identical to a sequence of at least 10 
contiguous amino acids in a sequence selected from the group consisting of: an amino 
acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1; and a 
complete amino acid sequence of a secreted protein encoded by a human cDNA clone 
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the 
ATCC Deposit Number shown for said cDNA clone in Table 1. 

In any of these methods, the step of detecting said polypeptide molecules 
includes using an antibody. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a nucleotide sequence encoding a 
polypeptide wherein said polypeptide comprises an amino acid sequence that is at 
least 90% identical to a sequence of at least 10 contiguous amino acids in a sequence 
selected from the group consisting of: an amino acid sequence of SEQ ID NO:Y 
wherein Y is any integer as defined in Table 1; and a complete amino acid sequence 
of a secreted protein encoded by a human cDNA clone identified by a cDNA Clone 
Identifier in Table 1 and contained in the deposit with the ATCC Deposit Number 
shown for said cDNA clone in Table 1. 

Also preferred is an isolated nucleic acid molecule, wherein said nucleotide 
sequence encoding a polypeptide has been optimized for expression of said 
polypeptide in a prokaryotic host. 

Also preferred is an isolated nucleic acid molecule, wherein said polypeptide 
comprises an amino acid sequence selected from the group consisting of: an amino 
acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1; and a 
complete amino acid sequence of a secreted protein encoded by a human cDNA clone 
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the 
ATCC Deposit Number shown for said cDNA clone in Table 1. 

Further preferred is a method of making a recombinant vector comprising 
inserting any of the above isolated nucleic acid molecules into a vector. Also preferred 
is the recombinant vector produced by this method. Also preferred is a method of 
making a recombinant host cell comprising introducing the vector into a host cell, as 
well as the recombinant host cell produced by this method. 

Also preferred is a method of making an isolated polypeptide comprising 
culturing this recombinant host cell under conditions such that said polypeptide is 
expressed and recovering said polypeptide. Also preferred is this method of making 
an isolated polypeptide, wherein said recombinant host cell is a eukaryotic cell and 
said polypeptide is a secreted portion of a human secreted protein comprising an 
amino acid sequence selected from the group consisting of: an amino acid sequence of 
SEQ ID NO:Y beginning with the residue at the position of the First Amino Acid of 
the Secreted Portion of SEQ ID NO:Y wherein Y is an integer set forth in Table 1 and 
said position of the First Amino Acid of the Secreted Portion of SEQ ID NO:Y is 
defined in Table 1; and an amino acid sequence of a secreted portion of a protein 
encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 
and contained in the deposit with the ATCC Deposit Number shown for said cDNA 
clone in Table 1. The isolated polypeptide produced by this method is also preferred. 
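
As an illustration of how a secreted portion is delimited by the position of its first amino acid, the following minimal sketch slices a full-length sequence at a 1-based position. The function name, the example sequence and the example position are hypothetical; real positions would be read from Table 1.

```python
def secreted_portion(full_sequence: str, first_secreted_aa: int) -> str:
    """Return the residues from the 1-based position `first_secreted_aa`
    to the end of the sequence, i.e. the sequence without its leader."""
    if not 1 <= first_secreted_aa <= len(full_sequence):
        raise ValueError("position lies outside the sequence")
    return full_sequence[first_secreted_aa - 1:]


# Hypothetical 25-residue protein whose secreted portion starts at residue 21.
full_length = "MKTLLLAVAVLGSQAEPTRWAPLSD"
print(secreted_portion(full_length, 21))
```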

Also preferred is a method of treatment of an individual in need of an 
increased level of a secreted protein activity, which method comprises administering 
to such an individual a pharmaceutical composition comprising an amount of an 
isolated polypeptide, polynucleotide, or antibody of the claimed invention effective to 
increase the level of said protein activity in said individual. 

Having generally described the invention, the same will be more readily 
understood by reference to the following examples, which are provided by way of 
illustration and are not intended as limiting. 



Example 1: Isolation of a Selected cDNA Clone From the Deposited Sample 

Each cDNA clone in a cited ATCC deposit is contained in a plasmid vector. 
Table 1 identifies the vectors used to construct the cDNA library from which each 
clone was isolated. In many cases, the vector used to construct the library is a phage 
vector from which a plasmid has been excised. The table immediately below 
correlates the related plasmid for each phage vector used in constructing the cDNA 
library. For example, where a particular clone is identified in Table 1 as being 
isolated in the vector "Lambda Zap," the corresponding deposited clone is in 
"pBluescript." 

Vector Used to Construct Library     Corresponding Deposited Plasmid

Lambda Zap                           pBluescript (pBS)
Uni-Zap XR                           pBluescript (pBS)
Zap Express                          pBK
lafmid BA                            plafmid BA
pSport1                              pSport1
pCMVSport 2.0                        pCMVSport 2.0
pCMVSport 3.0                        pCMVSport 3.0
pCR®2.1                              pCR®2.1



Vectors Lambda Zap (U.S. Patent Nos. 5,128,256 and 5,286,636), Uni-Zap 
XR (U.S. Patent Nos. 5,128,256 and 5,286,636), Zap Express (U.S. Patent Nos. 
5,128,256 and 5,286,636), pBluescript (pBS) (Short, J. M. et al., Nucleic Acids Res. 
16:7583-7600 (1988); Alting-Mees, M. A. and Short, J. M., Nucleic Acids Res. 
17:9494 (1989)) and pBK (Alting-Mees, M. A. et al., Strategies 5:58-61 (1992)) are 
commercially available from Stratagene Cloning Systems, Inc., 11011 N. Torrey 
Pines Road, La Jolla, CA, 92037. pBS contains an ampicillin resistance gene and 
pBK contains a neomycin resistance gene. Both can be transformed into E. coli strain 
XL-1 Blue, also available from Stratagene. pBS comes in 4 forms: SK+, SK-, KS+ 
and KS-. The S and K refer to the orientation of the polylinker to the T7 and T3 
primer sequences which flank the polylinker region ("S" is for SacI and "K" is for 
KpnI, which are the first sites on each respective end of the linker). "+" or "-" refer to 
the orientation of the f1 origin of replication ("ori"), such that in one orientation, 
single stranded rescue initiated from the f1 ori generates sense strand DNA and in the 
other, antisense. 

Vectors pSport1, pCMVSport 2.0 and pCMVSport 3.0 were obtained from 
Life Technologies, Inc., P. O. Box 6009, Gaithersburg, MD 20897. All Sport vectors 
contain an ampicillin resistance gene and may be transformed into E. coli strain 
DH10B, also available from Life Technologies. (See, for instance, Gruber, C. E., et 
al., Focus 15:59 (1993).) Vector lafmid BA (Bento Soares, Columbia University, 
NY) contains an ampicillin resistance gene and can be transformed into E. coli strain 
XL-1 Blue. Vector pCR®2.1, which is available from Invitrogen, 1600 Faraday 
Avenue, Carlsbad, CA 92008, contains an ampicillin resistance gene and may be 
transformed into E. coli strain DH10B, available from Life Technologies. (See, for 
instance, Clark, J. M., Nuc. Acids Res. 16:9677-9686 (1988) and Mead, D. et al., 
Bio/Technology 9: (1991).) Preferably, a polynucleotide of the present invention 
does not comprise the phage vector sequences identified for the particular clone in 
Table 1, as well as the corresponding plasmid vector sequences designated above. 

The deposited material in the sample assigned the ATCC Deposit Number 
cited in Table 1 for any given cDNA clone also may contain one or more additional 
plasmids, each comprising a cDNA clone different from that given clone. Thus, 
deposits sharing the same ATCC Deposit Number contain at least a plasmid for each 
cDNA clone identified in Table 1. Typically, each ATCC deposit sample cited in 
Table 1 comprises a mixture of approximately equal amounts (by weight) of about 50 
plasmid DNAs, each containing a different cDNA clone; but such a deposit sample 
may include plasmids for more or less than 50 cDNA clones, up to about 500 cDNA 
clones. 

Two approaches can be used to isolate a particular clone from the deposited 
sample of plasmid DNAs cited for that clone in Table 1. First, a plasmid is directly 
isolated by screening the clones using a polynucleotide probe corresponding to SEQ 
ID NO:X. 

Particularly, a specific polynucleotide with 30-40 nucleotides is synthesized 
using an Applied Biosystems DNA synthesizer according to the sequence reported. 
The oligonucleotide is labeled, for instance, with ³²P-γ-ATP using T4 polynucleotide 
kinase and purified according to routine methods. (E.g., Maniatis et al., Molecular 
Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring, NY (1982).) 
The plasmid mixture is transformed into a suitable host, as indicated above (such as 
XL-1 Blue (Stratagene)) using techniques known to those of skill in the art, such as 
those provided by the vector supplier or in related publications or patents cited above. 
The transformants are plated on 1.5% agar plates (containing the appropriate selection 
agent, e.g., ampicillin) to a density of about 150 transformants (colonies) per plate. 
These plates are screened using Nylon membranes according to routine methods for 
bacterial colony screening (e.g., Sambrook et al., Molecular Cloning: A Laboratory 
Manual, 2nd Edit., (1989), Cold Spring Harbor Laboratory Press, pages 1.93 to 
1.104), or other techniques known to those of skill in the art. 
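
A rough sense of how many colonies need to be screened can be sketched as follows. This is only an estimate under stated assumptions: every plasmid in the deposited mixture is taken to be equally likely to give rise to a colony (the deposit is described as approximately equal by weight, which is not exactly equal by molecule), and colonies are treated as independent. The function names are hypothetical.

```python
import math


def prob_at_least_one(n_colonies: int, n_clones: int) -> float:
    """Probability that at least one of `n_colonies` carries the clone of
    interest, assuming each colony carries it with probability 1/n_clones."""
    return 1.0 - (1.0 - 1.0 / n_clones) ** n_colonies


def colonies_needed(n_clones: int, confidence: float) -> int:
    """Smallest number of colonies giving at least `confidence` probability
    of including the clone of interest."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - 1.0 / n_clones))


# One plate at about 150 colonies, from a deposit of about 50 plasmids.
print(round(prob_at_least_one(150, 50), 3))
# Colonies to screen for 99% confidence with 50 and with 500 plasmids.
print(colonies_needed(50, 0.99), colonies_needed(500, 0.99))
```

Under these assumptions a single plate is usually sufficient for a deposit of about 50 plasmids, while a deposit approaching 500 plasmids calls for many more plates.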

Alternatively, two primers of 17-20 nucleotides derived from both ends of the 
SEQ ID NO:X (i.e., within the region of SEQ ID NO:X bounded by the 5' NT and the 
3' NT of the clone defined in Table 1) are synthesized and used to amplify the desired 
cDNA using the deposited cDNA plasmid as a template. The polymerase chain 
reaction is carried out under routine conditions, for instance, in 25 µl of reaction 
mixture with 0.5 µg of the above cDNA template. A convenient reaction mixture is 
1.5-5 mM MgCl2, 0.01% (w/v) gelatin, 20 µM each of dATP, dCTP, dGTP, dTTP, 25 
pmol of each primer and 0.25 Unit of Taq polymerase. Thirty five cycles of PCR 
(denaturation at 94°C for 1 min; annealing at 55°C for 1 min; elongation at 72°C for 1 
min) are performed with a Perkin-Elmer Cetus automated thermal cycler. The 
amplified product is analyzed by agarose gel electrophoresis and the DNA band with 
expected molecular weight is excised and purified. The PCR product is verified to be 
the selected sequence by subcloning and sequencing the DNA product. 
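
The derivation of a primer pair from both ends of the clone region can be sketched as below. This is a minimal illustration, not a primer design tool: it simply takes the first bases of the region as the forward primer and the reverse complement of the last bases as the reverse primer, without checking melting temperature or secondary structure. The function names, the example sequence and the example 5' NT and 3' NT positions are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def end_primers(seq: str, five_nt: int, three_nt: int, length: int = 18):
    """Forward and reverse primers of `length` bases taken from the two ends
    of the region bounded by the 1-based positions `five_nt` and `three_nt`."""
    region = seq[five_nt - 1:three_nt].upper()
    if len(region) < 2 * length:
        raise ValueError("region too short for two non-overlapping primers")
    forward = region[:length]
    reverse = reverse_complement(region[-length:])
    return forward, reverse


# Hypothetical sequence and clone boundaries, for illustration only.
example_seq = "ATGGCCAAGCTTGACCTGAAGGTTCCAGATCGTACGGATTCACTGAGGCT"
fwd, rev = end_primers(example_seq, 3, 48, length=18)
print(fwd, rev)
```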

Several methods are available for the identification of the 5' or 3' non-coding 
portions of a gene which may not be present in the deposited clone. These methods 
include, but are not limited to, filter probing, clone enrichment using specific probes, 
and protocols similar or identical to 5' and 3' "RACE" protocols which are well 
known in the art. For instance, a method similar to 5' RACE is available for 
generating the missing 5' end of a desired full-length transcript. (Fromont-Racine et 
al., Nucleic Acids Res. 21(7):1683-1684 (1993).) 

Briefly, a specific RNA oligonucleotide is ligated to the 5' ends of a 
population of RNA presumably containing full-length gene RNA transcripts. A 
primer set containing a primer specific to the ligated RNA oligonucleotide and a 
primer specific to a known sequence of the gene of interest is used to PCR amplify 
the 5' portion of the desired full-length gene. This amplified product may then be 
sequenced and used to generate the full length gene. 

This above method starts with total RNA isolated from the desired source, 
although poly-A+ RNA can be used. The RNA preparation can then be treated with 
phosphatase if necessary to eliminate 5' phosphate groups on degraded or damaged 
RNA which may interfere with the later RNA ligase step. The phosphatase should 
then be inactivated and the RNA treated with tobacco acid pyrophosphatase in order 
to remove the cap structure present at the 5' ends of messenger RNAs. This reaction 
leaves a 5' phosphate group at the 5' end of the cap cleaved RNA which can then be 
ligated to an RNA oligonucleotide using T4 RNA ligase. 

This modified RNA preparation is used as a template for first strand cDNA 
synthesis using a gene specific oligonucleotide. The first strand synthesis reaction is 
used as a template for PCR amplification of the desired 5' end using a primer specific 
to the ligated RNA oligonucleotide and a primer specific to the known sequence of 
the gene of interest. The resultant product is then sequenced and analyzed to confirm 
that the 5' end sequence belongs to the desired gene. 

Example 2: Isolation of Genomic Clones Corresponding to a Polynucleotide 

A human genomic P1 library (Genomic Systems, Inc.) is screened by PCR 
using primers selected for the cDNA sequence corresponding to SEQ ID NO:X, 
according to the method described in Example 1. (See also, Sambrook.) 

Example 3: Tissue Distribution of Polypeptide 

Tissue distribution of mRNA expression of polynucleotides of the present 
invention is determined using protocols for Northern blot analysis, described by, 
among others, Sambrook et al. For example, a cDNA probe produced by the method 
described in Example 1 is labeled with ³²P using the rediprime™ DNA labeling 
system (Amersham Life Science), according to manufacturer's instructions. After 
labeling, the probe is purified using CHROMA SPIN-100™ column (Clontech 
Laboratories, Inc.), according to manufacturer's protocol number PT1200-1. The 
purified labeled probe is then used to examine various human tissues for mRNA 
expression. 

Multiple Tissue Northern (MTN) blots containing various human tissues (H) 
or human immune system tissues (IM) (Clontech) are examined with the labeled 
probe using ExpressHyb™ hybridization solution (Clontech) according to 
manufacturer's protocol number PT1190-1. Following hybridization and washing, the 
blots are mounted and exposed to film at -70°C overnight, and the films developed 
according to standard procedures. 

Example 4: Chromosomal Mapping of the Polynucleotides 

An oligonucleotide primer set is designed according to the sequence at the 5' 
end of SEQ ID NO:X. This primer preferably spans about 100 nucleotides. This 
primer set is then used in a polymerase chain reaction under the following set of 
conditions: 30 seconds, 95°C; 1 minute, 56°C; 1 minute, 70°C. This cycle is 
repeated 32 times followed by one 5 minute cycle at 70°C. Human, mouse, and 
hamster DNA is used as template in addition to a somatic cell hybrid panel containing 
individual chromosomes or chromosome fragments (Bios, Inc). The reactions are 
analyzed on either 8% polyacrylamide gels or 3.5% agarose gels. Chromosome 
mapping is determined by the presence of an approximately 100 bp PCR fragment in 
the particular somatic cell hybrid. 
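
The way a chromosome assignment follows from the panel results can be sketched as a simple concordance check: the candidate chromosome is the one retained by exactly those hybrids that yield the approximately 100 bp fragment. The panel composition, hybrid names and function name below are hypothetical and serve only to illustrate the logic; a commercial panel would come with its own chromosome content table.

```python
def concordant_chromosomes(panel: dict, positive_hybrids: set) -> list:
    """Return the human chromosomes whose presence across the hybrid panel
    matches the pattern of PCR-positive hybrids exactly.

    panel: maps each hybrid name to the set of human chromosomes it retains.
    positive_hybrids: names of hybrids that gave the expected PCR fragment.
    """
    all_chromosomes = set().union(*panel.values())
    hits = []
    for chrom in sorted(all_chromosomes):
        carriers = {name for name, chroms in panel.items() if chrom in chroms}
        if carriers == set(positive_hybrids):
            hits.append(chrom)
    return hits


# Hypothetical three-hybrid panel for illustration only.
example_panel = {
    "hybrid_A": {"1", "7"},
    "hybrid_B": {"7", "X"},
    "hybrid_C": {"1", "X"},
}
print(concordant_chromosomes(example_panel, {"hybrid_A", "hybrid_B"}))
```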

Example 5: Bacterial Expression of a Polypeptide 

A polynucleotide encoding a polypeptide of the present invention is amplified 
using PCR oligonucleotide primers corresponding to the 5' and 3' ends of the DNA 
sequence, as outlined in Example 1, to synthesize insertion fragments. The primers 
used to amplify the cDNA insert should preferably contain restriction sites, such as 
BamHI and XbaI, at the 5' end of the primers in order to clone the amplified product 
into the expression vector. For example, BamHI and XbaI correspond to the 
restriction enzyme sites on the bacterial expression vector pQE-9 (Qiagen, Inc., 
Chatsworth, CA). This plasmid vector encodes antibiotic resistance (Ampr), a 
bacterial origin of replication (ori), an IPTG-regulatable promoter/operator (P/O), a 
ribosome binding site (RBS), a 6-histidine tag (6-His), and restriction enzyme cloning 
sites. 
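
Adding restriction sites to the 5' ends of the primers can be sketched as below. The recognition sequences used are the standard ones for BamHI (GGATCC) and XbaI (TCTAGA). The two extra 5' "clamp" bases, the function names and the example sequences are assumptions made for illustration, and the sketch does not check that the reading frame is maintained. The second function flags inserts that already contain one of the sites internally, since digestion would then cut the insert itself.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"BamHI": "GGATCC", "XbaI": "TCTAGA"}


def tail_primer(core: str, enzyme: str, clamp: str = "GC") -> str:
    """Prepend a restriction site (and a short 5' clamp, an assumption of
    this sketch) to the gene-specific part of a primer."""
    return clamp + SITES[enzyme] + core.upper()


def internal_sites(insert: str, enzymes: list) -> list:
    """Names of enzymes whose recognition site occurs inside the insert."""
    insert = insert.upper()
    return [name for name in enzymes if SITES[name] in insert]


# Hypothetical gene-specific primer and insert sequences.
print(tail_primer("atggccaagcttgacctg", "BamHI"))
print(internal_sites("ATGGCCTCTAGAAAGGTT", ["BamHI", "XbaI"]))
```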

The pQE-9 vector is digested with BamHI and XbaI and the amplified 
fragment is ligated into the pQE-9 vector maintaining the reading frame initiated at 
the bacterial RBS. The ligation mixture is then used to transform the E. coli strain 
M15/rep4 (Qiagen, Inc.) which contains multiple copies of the plasmid pREP4, which 
expresses the lacI repressor and also confers kanamycin resistance (Kanr). 
Transformants are identified by their ability to grow on LB plates and 
ampicillin/kanamycin resistant colonies are selected. Plasmid DNA is isolated and 
confirmed by restriction analysis. 

Clones containing the desired constructs are grown overnight (O/N) in liquid 
culture in LB media supplemented with both Amp (100 µg/ml) and Kan (25 µg/ml). 
The O/N culture is used to inoculate a large culture at a ratio of 1:100 to 1:250. The 
cells are grown to an optical density 600 (O.D.600) of between 0.4 and 0.6. IPTG 
(Isopropyl-B-D-thiogalactopyranoside) is then added to a final concentration of 1 
mM. IPTG induces by inactivating the lacI repressor, clearing the P/O leading to 
increased gene expression. 
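
The volumes implied by these concentrations and dilution ratios follow from simple proportion (C1 x V1 = C2 x V2). The sketch below works them out for a 1 liter culture; the culture volume and the stock concentrations (100 mg/ml ampicillin, 25 mg/ml kanamycin, 1 M IPTG) are assumptions chosen for illustration and do not come from the protocol above.

```python
def stock_volume_ml(final_conc: float, stock_conc: float,
                    final_volume_ml: float) -> float:
    """Volume of stock (ml) giving `final_conc` in `final_volume_ml`.
    Both concentrations must be in the same units."""
    return final_conc * final_volume_ml / stock_conc


def inoculum_ml(culture_volume_ml: float, dilution: int) -> float:
    """Volume of overnight culture (ml) for a 1:`dilution` inoculation."""
    return culture_volume_ml / dilution


culture_ml = 1000.0  # assumed 1 liter large culture

amp_ml = stock_volume_ml(100.0, 100_000.0, culture_ml)  # ug/ml, 100 mg/ml stock
kan_ml = stock_volume_ml(25.0, 25_000.0, culture_ml)    # ug/ml, 25 mg/ml stock
iptg_ml = stock_volume_ml(1.0, 1000.0, culture_ml)      # mM, 1 M stock

print(amp_ml, kan_ml, iptg_ml)
print(inoculum_ml(culture_ml, 100), inoculum_ml(culture_ml, 250))
```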

Cells are grown for an extra 3 to 4 hours. Cells are then harvested by 
centrifugation (20 mins at 6000 xg). The cell pellet is solubilized in the chaotropic 
agent 6 Molar Guanidine HCl by stirring for 3-4 hours at 4°C. The cell debris is 
removed by centrifugation, and the supernatant containing the polypeptide is loaded 
onto a nickel-nitrilo-tri-acetic acid ("Ni-NTA") affinity resin column (available from 
QIAGEN, Inc., supra). Proteins with a 6 x His tag bind to the Ni-NTA resin with 
high affinity and can be purified in a simple one-step procedure (for details see: The 
QIAexpressionist (1995) QIAGEN, Inc., supra). 

Briefly, the supernatant is loaded onto the column in 6 M guanidine-HCl, pH 
8, the column is first washed with 10 volumes of 6 M guanidine-HCl, pH 8, then 
washed with 10 volumes of 6 M guanidine-HCl pH 6, and finally the polypeptide is 
eluted with 6 M guanidine-HCl, pH 5. 

The purified protein is then renatured by dialyzing it against phosphate- 
buffered saline (PBS) or 50 mM Na-acetate, pH 6 buffer plus 200 mM NaCl. 
Alternatively, the protein can be successfully refolded while immobilized on the Ni- 
NTA column. The recommended conditions are as follows: renature using a linear 
6M-1M urea gradient in 500 mM NaCl, 20% glycerol, 20 mM Tris/HCl pH 7.4, 
containing protease inhibitors. The renaturation should be performed over a period of 
1.5 hours or more. After renaturation the proteins are eluted by the addition of 250 
mM imidazole. Imidazole is removed by a final dialyzing step against PBS or 50 
mM sodium acetate pH 6 buffer plus 200 mM NaCl. The purified protein is stored at 
4°C or frozen at -80°C. 

In addition to the above expression vector, the present invention further 
includes an expression vector comprising phage operator and promoter elements 
operatively linked to a polynucleotide of the present invention, called pHE4a. (ATCC 
Accession Number 209645, deposited on February 25, 1998.) This vector contains: 
1) a neomycinphosphotransferase gene as a selection marker, 2) an E. coli origin of 
replication, 3) a T5 phage promoter sequence, 4) two lac operator sequences, 5) a 
Shine-Dalgarno sequence, and 6) the lactose operon repressor gene (lacIq). The 
origin of replication (oriC) is derived from pUC19 (LTI, Gaithersburg, MD). The 
promoter sequence and operator sequences are made synthetically. 

DNA can be inserted into the pHE4a by restricting the vector with NdeI and 
XbaI, BamHI, XhoI, or Asp718, running the restricted product on a gel, and isolating 
the larger fragment (the stuffer fragment should be about 310 base pairs). The DNA 
insert is generated according to the PCR protocol described in Example 1, using PCR 
primers having restriction sites for NdeI (5' primer) and XbaI, BamHI, XhoI, or 
Asp718 (3' primer). The PCR insert is gel purified and restricted with compatible 
enzymes. The insert and vector are ligated according to standard protocols. 

The engineered vector could easily be substituted in the above protocol to 
express protein in a bacterial system. 

Example 6: Purification of a Polypeptide from an Inclusion Body 

The following alternative method can be used to purify a polypeptide 
expressed in E. coli when it is present in the form of inclusion bodies. Unless 
otherwise specified, all of the following steps are conducted at 4-10°C. 

Upon completion of the production phase of the E. coli fermentation, the cell 
culture is cooled to 4-10°C and the cells harvested by continuous centrifugation at 
15,000 rpm (Heraeus Sepatech). On the basis of the expected yield of protein per unit 
weight of cell paste and the amount of purified protein required, an appropriate 
amount of cell paste, by weight, is suspended in a buffer solution containing 100 mM 
Tris, 50 mM EDTA, pH 7.4. The cells are dispersed to a homogeneous suspension 
using a high shear mixer. 

The cells are then lysed by passing the solution through a microfluidizer 
(Microfluidics Corp. or APV Gaulin, Inc.) twice at 4000-6000 psi. The homogenate 
is then mixed with NaCl solution to a final concentration of 0.5 M NaCl, followed by 
centrifugation at 7000 xg for 15 min. The resultant pellet is washed again using 0.5M 
NaCl, 100 mM Tris, 50 mM EDTA, pH 7.4. 

The resulting washed inclusion bodies are solubilized with 1.5 M guanidine 
hydrochloride (GuHCl) for 2-4 hours. After 7000 xg centrifugation for 15 min., the 
pellet is discarded and the polypeptide containing supernatant is incubated at 4°C 
overnight to allow further GuHCl extraction. 

Following high speed centrifugation (30,000 xg) to remove insoluble particles, 
the GuHCl solubilized protein is refolded by quickly mixing the GuHCl extract with 
20 volumes of buffer containing 50 mM sodium, pH 4.5, 150 mM NaCl, 2 mM EDTA 
by vigorous stirring. The refolded diluted protein solution is kept at 4°C without 
mixing for 12 hours prior to further purification steps. 

To clarify the refolded polypeptide solution, a previously prepared tangential 
filtration unit equipped with 0.16 µm membrane filter with appropriate surface area 
(e.g., Filtron), equilibrated with 40 mM sodium acetate, pH 6.0 is employed. The 
filtered sample is loaded onto a cation exchange resin (e.g., Poros HS-50, Perseptive 
Biosystems). The column is washed with 40 mM sodium acetate, pH 6.0 and eluted 
with 250 mM, 500 mM, 1000 mM, and 1500 mM NaCl in the same buffer, in a 
stepwise manner. The absorbance at 280 nm of the effluent is continuously 
monitored. Fractions are collected and further analyzed by SDS-PAGE. 

Fractions containing the polypeptide are then pooled and mixed with 4 
volumes of water. The diluted sample is then loaded onto a previously prepared set of 
tandem columns of strong anion (Poros HQ-50, Perseptive Biosystems) and weak 
anion (Poros CM-20, Perseptive Biosystems) exchange resins. The columns are 
equilibrated with 40 mM sodium acetate, pH 6.0. Both columns are washed with 40 
mM sodium acetate, pH 6.0, 200 mM NaCl. The CM-20 column is then eluted using 
a 10 column volume linear gradient ranging from 0.2 M NaCl, 50 mM sodium 
acetate, pH 6.0 to 1.0 M NaCl, 50 mM sodium acetate, pH 6.5. Fractions are 
collected under constant A280 monitoring of the effluent. Fractions containing the 
polypeptide (determined, for instance, by 16% SDS-PAGE) are then pooled. 
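
The salt concentration at any point of such a linear gradient follows from straight-line interpolation between the start and end buffers. The sketch below, whose function name is hypothetical, computes the nominal NaCl concentration delivered at a given number of column volumes; it ignores system dead volume and mixing delay, so the concentration actually reaching the resin will lag slightly.

```python
def linear_gradient(start: float, end: float, total_cv: float, cv: float) -> float:
    """Nominal concentration after `cv` column volumes of a linear gradient
    running from `start` to `end` over `total_cv` column volumes."""
    if not 0 <= cv <= total_cv:
        raise ValueError("cv must lie within the gradient")
    return start + (end - start) * cv / total_cv


# NaCl (M) across the 10 column volume gradient from 0.2 M to 1.0 M.
for cv in range(0, 11, 2):
    print(cv, round(linear_gradient(0.2, 1.0, 10, cv), 2))
```

Recording the column volume at which the A280 peak elutes then gives an estimate of the salt concentration needed to elute the polypeptide.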

The resultant polypeptide should exhibit greater than 95% purity after the 
above refolding and purification steps. No major contaminant bands should be 
observed from Coomassie blue stained 16% SDS-PAGE gel when 5 µg of purified 
protein is loaded. The purified protein can also be tested for endotoxin/LPS 
contamination, and typically the LPS content is less than 0.1 ng/ml according to LAL 
assays. 

Example 7: Cloning and Expression of a Polypeptide in a Baculovirus 
Expression System 

In this example, the plasmid shuttle vector pA2 is used to insert a 
polynucleotide into a baculovirus to express a polypeptide. This expression vector 
contains the strong polyhedrin promoter of the Autographa californica nuclear 
polyhedrosis virus (AcMNPV) followed by convenient restriction sites such as 
BamHI, Xba I and Asp718. The polyadenylation site of the simian virus 40 ("SV40") 
is used for efficient polyadenylation. For easy selection of recombinant virus, the 
plasmid contains the beta-galactosidase gene from E. coli under control of a weak 
Drosophila promoter in the same orientation, followed by the polyadenylation signal 
of the polyhedrin gene. The inserted genes are flanked on both sides by viral 
sequences for cell-mediated homologous recombination with wild-type viral DNA to 
generate a viable virus that expresses the cloned polynucleotide. 

Many other baculovirus vectors can be used in place of the vector above, such 
as pAc373, pVL941, and pAcIM1, as one skilled in the art would readily appreciate, 
as long as the construct provides appropriately located signals for transcription, 
translation, secretion and the like, including a signal peptide and an in-frame AUG as 
required. Such vectors are described, for instance, in Luckow et al., Virology 
170:31-39 (1989). 

Specifically, the cDNA sequence contained in the deposited clone, including 
the AUG initiation codon and the naturally associated leader sequence identified in 
Table 1, is amplified using the PCR protocol described in Example 1. If the naturally 
occurring signal sequence is used to produce the secreted protein, the pA2 vector does 
not need a second signal peptide. Alternatively, the vector can be modified (pA2 GP) 
to include a baculovirus leader sequence, using the standard methods described in 
Summers et al., "A Manual of Methods for Baculovirus Vectors and Insect Cell 
Culture Procedures," Texas Agricultural Experimental Station Bulletin No. 1555 
(1987). 

The amplified fragment is isolated from a 1% agarose gel using a 
commercially available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The fragment 
then is digested with appropriate restriction enzymes and again purified on a 1% 
agarose gel. 

The plasmid is digested with the corresponding restriction enzymes and 
optionally, can be dephosphorylated using calf intestinal phosphatase, using routine 
procedures known in the art. The DNA is then isolated from a 1% agarose gel using a 
commercially available kit ("Geneclean" BIO 101 Inc., La Jolla, Ca.). 

The fragment and the dephosphorylated plasmid are ligated together with T4 
DNA ligase. E. coli HB101 or other suitable E. coli hosts such as XL-1 Blue 
(Stratagene Cloning Systems, La Jolla, CA) cells are transformed with the ligation 
mixture and spread on culture plates. Bacteria containing the plasmid are identified 
by digesting DNA from individual colonies and analyzing the digestion product by 
gel electrophoresis. The sequence of the cloned fragment is confirmed by DNA 
sequencing. 

Five µg of a plasmid containing the polynucleotide is co-transfected with 1.0 
µg of a commercially available linearized baculovirus DNA ("BaculoGold™ 
baculovirus DNA", Pharmingen, San Diego, CA), using the lipofection method 
described by Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987). One µg 
of BaculoGold™ virus DNA and 5 µg of the plasmid are mixed in a sterile well of a 
microtiter plate containing 50 µl of serum-free Grace's medium (Life Technologies 
Inc., Gaithersburg, MD). Afterwards, 10 µl Lipofectin plus 90 µl Grace's medium are 
added, mixed and incubated for 15 minutes at room temperature. Then the 
transfection mixture is added drop-wise to Sf9 insect cells (ATCC CRL 1711) seeded 
in a 35 mm tissue culture plate with 1 ml Grace's medium without serum. The plate is 
then incubated for 5 hours at 27°C. The transfection solution is then removed from 
the plate and 1 ml of Grace's insect medium supplemented with 10% fetal calf serum 
is added. Cultivation is then continued at 27°C for four days. 

After four days the supernatant is collected and a plaque assay is performed, 
as described by Summers and Smith, supra. An agarose gel with "Blue Gal" (Life 
Technologies Inc., Gaithersburg) is used to allow easy identification and isolation of 
gal-expressing clones, which produce blue-stained plaques. (A detailed description of 
a "plaque assay" of this type can also be found in the user's guide for insect cell 
culture and baculovirology distributed by Life Technologies Inc., Gaithersburg, page 
9-10.) After appropriate incubation, blue stained plaques are picked with the tip of a 
micropipettor (e.g., Eppendorf). The agar containing the recombinant viruses is then 
resuspended in a microcentrifuge tube containing 200 µl of Grace's medium and the 
suspension containing the recombinant baculovirus is used to infect Sf9 cells seeded 
in 35 mm dishes. Four days later the supernatants of these culture dishes are 
harvested and then they are stored at 4°C. 

To verify the expression of the polypeptide, Sf9 cells are grown in Grace's 
medium supplemented with 10% heat-inactivated FBS. The cells are infected with 
the recombinant baculovirus containing the polynucleotide at a multiplicity of 
infection ("MOI") of about 2. If radiolabeled proteins are desired, 6 hours later the 
medium is removed and is replaced with SF900 II medium minus methionine and 
cysteine (available from Life Technologies Inc., Rockville, MD). After 42 hours, 5 
µCi of ³⁵S-methionine and 5 µCi ³⁵S-cysteine (available from Amersham) are added. 
The cells are further incubated for 16 hours and then are harvested by centrifugation. 
The proteins in the supernatant as well as the intracellular proteins are analyzed by 
SDS-PAGE followed by autoradiography (if radiolabeled). 
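
The volume of virus stock needed to reach a given multiplicity of infection follows from MOI = (plaque-forming units added) / (number of cells). The sketch below applies that relation; the function name, the cell number and the stock titer are hypothetical values chosen for illustration, and the titer of a real stock would be measured by plaque assay.

```python
def virus_volume_ml(moi: float, n_cells: float, titer_pfu_per_ml: float) -> float:
    """Volume of virus stock (ml) that delivers `moi` plaque-forming units
    per cell to `n_cells` cells, given the stock titer in pfu/ml."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return moi * n_cells / titer_pfu_per_ml


# Assumed values: 1e6 Sf9 cells per dish and a stock titer of 1e8 pfu/ml.
volume = virus_volume_ml(moi=2, n_cells=1e6, titer_pfu_per_ml=1e8)
print(f"{volume * 1000:.0f} microliters of virus stock")
```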

Microsequencing of the amino acid sequence of the amino terminus of 
purified protein may be used to determine the amino terminal sequence of the 
produced protein. 

Example 8: Expression of a Polypeptide in Mammalian Cells

The polypeptide of the present invention can be expressed in a mammalian 
cell. A typical mammalian expression vector contains a promoter element, which 
mediates the initiation of transcription of mRNA, a protein coding sequence, and 
signals required for the termination of transcription and polyadenylation of the 
transcript. Additional elements include enhancers, Kozak sequences and intervening
sequences flanked by donor and acceptor sites for RNA splicing. Highly efficient
transcription is achieved with the early and late promoters from SV40, the long
terminal repeats (LTRs) from retroviruses, e.g., RSV, HTLVI, HIVI, and the early
promoter of the cytomegalovirus (CMV). However, cellular elements can also be
used (e.g., the human actin promoter).

Suitable expression vectors for use in practicing the present invention include, 
for example, vectors such as pSVL and pMSG (Pharmacia, Uppsala, Sweden),
pRSVcat (ATCC 37152), pSV2dhfr (ATCC 37146), pBC12MI (ATCC 67109),
pCMVSport 2.0, and pCMVSport 3.0. Mammalian host cells that could be used
include human Hela, 293, H9 and Jurkat cells, mouse NIH3T3 and C127 cells, Cos 1,
Cos 7 and CV1, quail QC1-3 cells, mouse L cells and Chinese hamster ovary (CHO)
cells.

Alternatively, the polypeptide can be expressed in stable cell lines containing
the polynucleotide integrated into a chromosome. Co-transfection with a
selectable marker such as dhfr, gpt, neomycin, or hygromycin allows the identification
and isolation of the transfected cells.

The transfected gene can also be amplified to express large amounts of the 
encoded protein. The DHFR (dihydrofolate reductase) marker is useful in developing
cell lines that carry several hundred or even several thousand copies of the gene of
interest. (See, e.g., Alt, F. W., et al., J. Biol. Chem. 253:1357-1370 (1978); Hamlin, J.
L. and Ma, C., Biochem. et Biophys. Acta, 1097:107-143 (1990); Page, M. J. and
Sydenham, M. A., Biotechnology 9:64-68 (1991).) Another useful selection marker
is the enzyme glutamine synthase (GS) (Murphy et al., Biochem J. 227:277-279
(1991); Bebbington et al., Bio/Technology 10:169-175 (1992)). Using these markers,
the mammalian cells are grown in selective medium and the cells with the highest
resistance are selected. These cell lines contain the amplified gene(s) integrated into a
chromosome. Chinese hamster ovary (CHO) and NSO cells are often used for the
production of proteins.

Derivatives of the plasmid pSV2-dhfr (ATCC Accession No. 37146), the 
expression vectors pC4 (ATCC Accession No. 209646) and pC6 (ATCC Accession 
No. 209647) contain the strong promoter (LTR) of the Rous Sarcoma Virus (Cullen et
al., Molecular and Cellular Biology, 438-447 (March, 1985)) plus a fragment of the
CMV-enhancer (Boshart et al., Cell 41:521-530 (1985)). Multiple cloning sites, e.g.,
with the restriction enzyme cleavage sites BamHI, XbaI and Asp718, facilitate the
cloning of the gene of interest. The vectors also contain the 3' intron, the
polyadenylation and termination signal of the rat preproinsulin gene, and the mouse
DHFR gene under control of the SV40 early promoter.

Specifically, the plasmid pC6, for example, is digested with appropriate
restriction enzymes and then dephosphorylated using calf intestinal phosphatase by
procedures known in the art. The vector is then isolated from a 1% agarose gel.

A polynucleotide of the present invention is amplified according to the
protocol outlined in Example 1. If the naturally occurring signal sequence is used to
produce the secreted protein, the vector does not need a second signal peptide.
Alternatively, if the naturally occurring signal sequence is not used, the vector can be
modified to include a heterologous signal sequence. (See, e.g., WO 96/34891.)

The amplified fragment is isolated from a 1% agarose gel using a
commercially available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The fragment
then is digested with appropriate restriction enzymes and again purified on a 1%
agarose gel.

The amplified fragment is then digested with the same restriction enzyme and
purified on a 1% agarose gel. The isolated fragment and the dephosphorylated vector
are then ligated with T4 DNA ligase. E. coli HB101 or XL-1 Blue cells are then
transformed and bacteria are identified that contain the fragment inserted into plasmid
pC6 using, for instance, restriction enzyme analysis.

Chinese hamster ovary cells lacking an active DHFR gene are used for
transfection. Five µg of the expression plasmid pC6 is cotransfected with 0.5 µg of
the plasmid pSV2-neo using lipofectin (Felgner et al., supra). The plasmid pSV2-neo
contains a dominant selectable marker, the neo gene from Tn5 encoding an enzyme
that confers resistance to a group of antibiotics including G418. The cells are seeded
in alpha minus MEM supplemented with 1 mg/ml G418. After 2 days, the cells are
trypsinized and seeded in hybridoma cloning plates (Greiner, Germany) in alpha
minus MEM supplemented with 10, 25, or 50 ng/ml of methotrexate plus 1 mg/ml
G418. After about 10-14 days single clones are trypsinized and then seeded in 6-well
petri dishes or 10 ml flasks using different concentrations of methotrexate (50 nM,
100 nM, 200 nM, 400 nM, 800 nM). Clones growing at the highest concentrations of
methotrexate are then transferred to new 6-well plates containing even higher
concentrations of methotrexate (1 µM, 2 µM, 5 µM, 10 µM, 20 µM). The same
procedure is repeated until clones are obtained which grow at a concentration of 100 -
200 µM. Expression of the desired gene product is analyzed, for instance, by SDS-
PAGE and Western blot or by reversed phase HPLC analysis.

Example 9: Protein Fusions

The polypeptides of the present invention are preferably fused to other
proteins. These fusion proteins can be used for a variety of applications. For
example, fusion of the present polypeptides to His-tag, HA-tag, protein A, IgG
domains, and maltose binding protein facilitates purification. (See Example 5; see
also EP A 394,827; Traunecker, et al., Nature 331:84-86 (1988).) Similarly, fusion to
IgG-1, IgG-3, and albumin increases the half-life in vivo. Nuclear localization
signals fused to the polypeptides of the present invention can target the protein to a
specific subcellular localization, while covalent heterodimers or homodimers can
increase or decrease the activity of a fusion protein. Fusion proteins can also create
chimeric molecules having more than one function. Finally, fusion proteins can
increase solubility and/or stability of the fused protein compared to the non-fused
protein. All of the types of fusion proteins described above can be made by
modifying the following protocol, which outlines the fusion of a polypeptide to an
IgG molecule, or the protocol described in Example 5.

Briefly, the human Fc portion of the IgG molecule can be PCR amplified,
using primers that span the 5' and 3' ends of the sequence described below. These
primers also should have convenient restriction enzyme sites that will facilitate
cloning into an expression vector, preferably a mammalian expression vector.

For example, if pC4 (Accession No. 209646) is used, the human Fc portion
can be ligated into the BamHI cloning site. Note that the 3' BamHI site should be
destroyed. Next, the vector containing the human Fc portion is re-restricted with
BamHI, linearizing the vector, and a polynucleotide of the present invention, isolated
by the PCR protocol described in Example 1, is ligated into this BamHI site. Note
that the polynucleotide is cloned without a stop codon; otherwise a fusion protein will
not be produced.
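
The requirement that the insert carry no stop codon can be checked in silico before cloning. The following Python sketch is offered only as an illustration: the function names and the toy sequences are hypothetical, and the additional check for an internal BamHI site (GGATCC) is an assumption that follows from the use of BamHI for cloning rather than a step recited in this Example. The sketch does not verify that the junction with the Fc sequence is in frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence


def in_frame_stop_positions(orf):
    """Return 0-based nucleotide offsets of stop codons in reading frame 0."""
    seq = orf.upper()
    return [i for i in range(0, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS]


def is_fusion_ready(orf):
    """True if the insert is a whole number of codons, has no in-frame stop
    codon, and has no internal BamHI site (the last check is an assumption)."""
    seq = orf.upper()
    return (
        len(seq) % 3 == 0
        and not in_frame_stop_positions(seq)
        and BAMHI_SITE not in seq
    )


# Hypothetical toy inserts, not sequences from this document.
with_stop = "ATGGCTGCTTAA"
without_stop = "ATGGCTGCT"
print(in_frame_stop_positions(with_stop))
print(is_fusion_ready(without_stop))
```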

If the naturally occurring signal sequence is used to produce the secreted 
protein, pC4 does not need a second signal peptide. Alternatively, if the naturally 
occurring signal sequence is not used, the vector can be modified to include a 
heterologous signal sequence. (See, e.g., WO 96/34891.)

Human IgG Fc region:

GGGATCCGGAGCCCAAATCTTCTGACAAAACTCACACATGCCCACCGTGC

CCAGCACCTGAATTCGAGGGTGCACCGTCAGTCTTCCTCTTCCCCCCAAAA 

CCCAAGGACACCCTCATGATCTCCCGGACTCCTGAGGTCACATGCGTGGT 

GGTGGACGTAAGCCACGAAGACCCTGAGGTCAAGTTCAACTGGTACGTGG 

ACGGCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTA 

CAACAGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACT 

GGCTGAATGGCAAGGAGTACAAGTGCAAGGTCTCCAACAAAGCCCTCCCA 

ACCCCCATCGAGAAAACCATCTCCAAAGCCAAAGGGCAGCCCCGAGAAC 

CACAGGTGTACACCCTGCCCCCATCCCGGGATGAGCTGACCAAGAACCAG 

GTCAGCCTGACCTGCCTGGTCAAAGGCTTCTATCCAAGCGACATCGCCGT 

GGAGTGGGAGAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCT 

CCCGTGCTGGACTCCGACGGCTCCTTCTTCCTCTACAGCAAGCTCACCGTG 

GACAAGAGCAGGTGGCAGCAGGGGAACGTCTTCTCATGCTCCGTGATGCA 

TGAGGCTCTGCACAACCACTACACGCAGAAGAGCCTCTCCCTGTCTCCGG 

GTAAATGAGTGCGACGGCCGCGACTCTAGAGGAT (SEQ ID NO:1)

Example 10: Production of an Antibody from a Polypeptide 

The antibodies of the present invention can be prepared by a variety of 
methods. (See, Current Protocols, Chapter 2.) For example, cells expressing a 
polypeptide of the present invention are administered to an animal to induce the
production of sera containing polyclonal antibodies. In a preferred method, a 
preparation of the secreted protein is prepared and purified to render it substantially 
free of natural contaminants. Such a preparation is then introduced into an animal in 
order to produce polyclonal antisera of greater specific activity. 

In the most preferred method, the antibodies of the present invention are 
monoclonal antibodies (or protein binding fragments thereof). Such monoclonal
antibodies can be prepared using hybridoma technology. (Kohler et al., Nature
256:495 (1975); Kohler et al., Eur. J. Immunol. 6:511 (1976); Kohler et al., Eur. J.
Immunol. 6:292 (1976); Hammerling et al., in: Monoclonal Antibodies and T-Cell
Hybridomas, Elsevier, N.Y., pp. 563-681 (1981).) In general, such procedures
involve immunizing an animal (preferably a mouse) with polypeptide or, more
preferably, with a secreted polypeptide-expressing cell. Such cells may be cultured in
any suitable tissue culture medium; however, it is preferable to culture cells in Earle's
modified Eagle's medium supplemented with 10% fetal bovine serum (inactivated at
about 56°C), and supplemented with about 10 g/l of nonessential amino acids, about
1,000 U/ml of penicillin, and about 100 µg/ml of streptomycin.

The splenocytes of such mice are extracted and fused with a suitable myeloma 
cell line. Any suitable myeloma cell line may be employed in accordance with the 
present invention; however, it is preferable to employ the parent myeloma cell line 
(SP20), available from the ATCC. After fusion, the resulting hybridoma cells are 
selectively maintained in HAT medium, and then cloned by limiting dilution as
described by Wands et al. (Gastroenterology 80:225-232 (1981)). The hybridoma
cells obtained through such a selection are then assayed to identify clones which 
secrete antibodies capable of binding the polypeptide. 

Alternatively, additional antibodies capable of binding to the polypeptide can 
be produced in a two-step procedure using anti-idiotypic antibodies. Such a method 
makes use of the fact that antibodies are themselves antigens, and therefore, it is 
possible to obtain an antibody which binds to a second antibody. In accordance with 
this method, protein specific antibodies are used to immunize an animal, preferably a 
mouse. The splenocytes of such an animal are then used to produce hybridoma cells, 
and the hybridoma cells are screened to identify clones which produce an antibody 
whose ability to bind to the protein-specific antibody can be blocked by the 
polypeptide. Such antibodies comprise anti-idiotypic antibodies to the protein- 
specific antibody and can be used to immunize an animal to induce formation of 
further protein-specific antibodies. 

It will be appreciated that Fab and F(ab')2 and other fragments of the 
antibodies of the present invention may be used according to the methods disclosed 
herein. Such fragments are typically produced by proteolytic cleavage, using 
enzymes such as papain (to produce Fab fragments) or pepsin (to produce F(ab')2 
fragments). Alternatively, secreted protein-binding fragments can be produced 
through the application of recombinant DNA technology or through synthetic
chemistry. 

For in vivo use of antibodies in humans, it may be preferable to use
"humanized" chimeric monoclonal antibodies. Such antibodies can be produced
using genetic constructs derived from hybridoma cells producing the monoclonal
antibodies described above. Methods for producing chimeric antibodies are known in
the art. (See, for review, Morrison, Science 229:1202 (1985); Oi et al.,
BioTechniques 4:214 (1986); Cabilly et al., U.S. Patent No. 4,816,567; Taniguchi et
al., EP 171496; Morrison et al., EP 173494; Neuberger et al., WO 8601533; Robinson
et al., WO 8702671; Boulianne et al., Nature 312:643 (1984); Neuberger et al., Nature
314:268 (1985).)

Example 11: Production Of Secreted Protein For High-Throughput Screening 
Assays 

The following protocol produces a supernatant containing a polypeptide to be 
tested. This supernatant can then be used in the Screening Assays described in 
Examples 13-20. 

First, dilute Poly-D-Lysine (644 587 Boehringer-Mannheim) stock solution 
(1 mg/ml in PBS) 1:20 in PBS (w/o calcium or magnesium 17-516F Biowhittaker) for
a working solution of 50 ug/ml. Add 200 ul of this solution to each well (24 well
plates) and incubate at RT for 20 minutes. Be sure to distribute the solution over each
well (note: a 12-channel pipetter may be used with tips on every other channel).
Aspirate off the Poly-D-Lysine solution and rinse with 1 ml PBS (Phosphate Buffered
Saline). The PBS should remain in the well until just prior to plating the cells and
plates may be poly-lysine coated in advance for up to two weeks.
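
The dilution arithmetic above (1 mg/ml stock diluted 1:20 to 50 ug/ml, 200 ul per well of a 24-well plate) can be sketched as follows. This is a minimal illustration; the function name is hypothetical and no allowance for pipetting dead volume is included, which is an assumption of the sketch.

```python
def coating_volumes(stock_ug_per_ml, dilution_factor, n_wells, ul_per_well):
    """Working concentration and stock/PBS volumes for a 1:dilution_factor dilution."""
    total_ul = n_wells * ul_per_well
    stock_ul = total_ul / dilution_factor
    return {
        "working_ug_per_ml": stock_ug_per_ml / dilution_factor,
        "total_ul": total_ul,
        "stock_ul": stock_ul,
        "pbs_ul": total_ul - stock_ul,
    }


# One 24-well plate, using the figures given in the text (1 mg/ml = 1000 ug/ml).
plan = coating_volumes(stock_ug_per_ml=1000, dilution_factor=20,
                       n_wells=24, ul_per_well=200)
print(plan)
```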

Plate 293T cells (do not carry cells past P+20) at 2x10^5 cells/well in 0.5 ml
DMEM (Dulbecco's Modified Eagle Medium) (with 4.5 G/L glucose and L-glutamine
(12-604F Biowhittaker))/10% heat inactivated FBS (14-503F Biowhittaker)/1x
Penstrep (17-602E Biowhittaker). Let the cells grow overnight.

The next day, mix together in a sterile solution basin: 300 ul Lipofectamine
(18324-012 Gibco/BRL) and 5 ml Optimem I (31985070 Gibco/BRL)/96-well plate.
With a small volume multi-channel pipetter, aliquot approximately 2 ug of an
expression vector containing a polynucleotide insert, produced by the methods
described in Examples 8 or 9, into an appropriately labeled 96-well round bottom
plate. With a multi-channel pipetter, add 50 ul of the Lipofectamine/Optimem I
mixture to each well. Pipette up and down gently to mix. Incubate at RT 15-45
minutes. After about 20 minutes, use a multi-channel pipetter to add 150 ul Optimem
I to each well. As a control, one plate of vector DNA lacking an insert should be
transfected with each set of transfections.
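
The volumes in the preceding paragraph can be cross-checked with a short calculation. The Python sketch below is illustrative only; the constant and function names are hypothetical, and the small volume contributed by the DNA itself is ignored, which is an assumption of the sketch.

```python
LIPOFECTAMINE_UL = 300.0   # per 96-well plate, from the text
OPTIMEM_UL = 5000.0        # 5 ml Optimem I per 96-well plate, from the text
WELLS = 96
MIX_PER_WELL_UL = 50.0     # Lipofectamine/Optimem I mixture added to each well
OPTIMEM_TOPUP_UL = 150.0   # Optimem I added after about 20 minutes


def transfection_budget():
    """Summarize master-mix usage and the final complex volume per well."""
    master_mix = LIPOFECTAMINE_UL + OPTIMEM_UL
    dispensed = WELLS * MIX_PER_WELL_UL
    return {
        "master_mix_ul": master_mix,
        "dispensed_ul": dispensed,
        "excess_ul": master_mix - dispensed,
        "lipofectamine_per_well_ul": MIX_PER_WELL_UL * LIPOFECTAMINE_UL / master_mix,
        "final_volume_per_well_ul": MIX_PER_WELL_UL + OPTIMEM_TOPUP_UL,
    }


budget = transfection_budget()
for name, value in budget.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the master mix covers one 96-well plate with a modest excess, and each well ends at a 200 ul complex volume.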

Preferably, the transfection should be performed by tag-teaming the following 
tasks. By tag-teaming, hands-on time is cut in half, and the cells do not spend too
much time on PBS. First, person A aspirates off the media from four 24-well plates
of cells, and then person B rinses each well with 0.5-1 ml PBS. Person A then aspirates
off PBS rinse, and person B, using a 12-channel pipetter with tips on every other
channel, adds the 200 ul of DNA/Lipofectamine/Optimem I complex to the odd wells
first, then to the even wells, to each row on the 24-well plates. Incubate at 37°C for 6
hours.

While cells are incubating, prepare appropriate media, either 1% BSA in
DMEM with 1x penstrep, or CHO-5 media (116.6 mg/L of CaCl2 (anhyd); 0.00130
mg/L CuSO4-5H2O; 0.050 mg/L of Fe(NO3)3-9H2O; 0.417 mg/L of FeSO4-7H2O;
311.80 mg/L of KCl; 28.64 mg/L of MgCl2; 48.84 mg/L of MgSO4; 6995.50 mg/L of
NaCl; 2400.0 mg/L of NaHCO3; 62.50 mg/L of NaH2PO4-H2O; 71.02 mg/L of
Na2HPO4; .4320 mg/L of ZnSO4-7H2O; .002 mg/L of Arachidonic Acid; 1.022 mg/L
of Cholesterol; .070 mg/L of DL-alpha-Tocopherol-Acetate; 0.0520 mg/L of
Linoleic Acid; 0.010 mg/L of Linolenic Acid; 0.010 mg/L of Myristic Acid; 0.010 mg/L
of Oleic Acid; 0.010 mg/L of Palmitric Acid; 0.010 mg/L of Palmitic Acid; 100 mg/L of
Pluronic F-68; 0.010 mg/L of Stearic Acid; 2.20 mg/L of Tween 80; 4551 mg/L of
D-Glucose; 130.85 mg/ml of L-Alanine; 147.50 mg/ml of L-Arginine-HCL; 7.50 mg/ml
of L-Asparagine-H2O; 6.65 mg/ml of L-Aspartic Acid; 29.56 mg/ml of
L-Cystine-2HCL-H2O; 31.29 mg/ml of L-Cystine-2HCL; 7.35 mg/ml of L-Glutamic Acid;
365.0 mg/ml of L-Glutamine; 18.75 mg/ml of Glycine; 52.48 mg/ml of
L-Histidine-HCL-H2O; 106.97 mg/ml of L-Isoleucine; 111.45 mg/ml of L-Leucine;
163.75 mg/ml of L-Lysine HCL; 32.34 mg/ml of L-Methionine; 68.48 mg/ml of
L-Phenylalanine; 40.0 mg/ml of L-Proline; 26.25 mg/ml of L-Serine; 101.05 mg/ml of
L-Threonine; 19.22 mg/ml of L-Tryptophan; 91.79 mg/ml of L-Tyrosine-2Na-2H2O;
99.65 mg/ml of L-Valine; 0.0035 mg/L of Biotin; 3.24 mg/L of D-Ca Pantothenate;
11.78 mg/L of Choline Chloride; 4.65 mg/L of Folic Acid; 15.60 mg/L of i-Inositol;
3.02 mg/L of Niacinamide; 3.00 mg/L of Pyridoxal HCL; 0.031 mg/L of Pyridoxine HCL;
0.319 mg/L of Riboflavin; 3.17 mg/L of Thiamine HCL; 0.365 mg/L of Thymidine; and
0.680 mg/L of Vitamin B12; 25 mM of HEPES Buffer; 2.39 mg/L of Na
Hypoxanthine; 0.105 mg/L of Lipoic Acid; 0.081 mg/L of Sodium Putrescine-2HCL;
55.0 mg/L of Sodium Pyruvate; 0.0067 mg/L of Sodium Selenite; 20 uM of
Ethanolamine; 0.122 mg/L of Ferric Citrate; 41.70 mg/L of Methyl-B-Cyclodextrin
complexed with Linoleic Acid; 33.33 mg/L of Methyl-B-Cyclodextrin complexed
with Oleic Acid; and 10 mg/L of Methyl-B-Cyclodextrin complexed with Retinal)
with 2 mM glutamine and 1x penstrep. (BSA (81-068-3 Bayer) 100 gm dissolved in 1 L
DMEM for a 10% BSA stock solution). Filter the media and collect 50 ul for
endotoxin assay in 15 ml polystyrene conical.

The transfection reaction is terminated, preferably by tag-teaming, at the end 
of the incubation period. Person A aspirates off the transfection media, while person 
B adds 1.5ml appropriate media to each well. Incubate at 37°C for 45 or 72 hours 
depending on the media used: 1% BSA for 45 hours or CHO-5 for 72 hours.

On day four, using a 300 ul multichannel pipetter, aliquot 600 ul in one 1 ml
deep well plate and the remaining supernatant into a 2 ml deep well. The supernatants
from each well can then be used in the assays described in Examples 13-20.

It is specifically understood that when activity is obtained in any of the assays 
described below using a supernatant, the activity originates from either the 
polypeptide directly (e.g., as a secreted protein) or by the polypeptide inducing
expression of other proteins, which are then secreted into the supernatant. Thus, the
invention further provides a method of identifying the protein in the supernatant 
characterized by an activity in a particular assay. 

Example 12: Construction of GAS Reporter Construct 

One signal transduction pathway involved in the differentiation and 
proliferation of cells is called the Jaks-STATs pathway. Activated proteins in the 
Jaks-STATs pathway bind to gamma activation site "GAS" elements or interferon-
sensitive responsive element ("ISRE"), located in the promoter of many genes. The
binding of a protein to these elements alters the expression of the associated gene.

GAS and ISRE elements are recognized by a class of transcription factors 
called Signal Transducers and Activators of Transcription, or "STATs." There are six
members of the STATs family. Stat1 and Stat3 are present in many cell types, as is
Stat2 (as response to IFN-alpha is widespread). Stat4 is more restricted and is not in
many cell types though it has been found in T helper class 1 cells after treatment with
IL-12. Stat5 was originally called mammary growth factor, but has been found at
higher concentrations in other cells including myeloid cells. It can be activated in
tissue culture cells by many cytokines.

The STATs are activated to translocate from the cytoplasm to the nucleus 
upon tyrosine phosphorylation by a set of kinases known as the Janus Kinase ("Jaks")
family. Jaks represent a distinct family of soluble tyrosine kinases and include Tyk2,
Jak1, Jak2, and Jak3. These kinases display significant sequence similarity and are
generally catalytically inactive in resting cells. 

The Jaks are activated by a wide range of receptors summarized in the Table 
below. (Adapted from review by Schindler and Darnell, Ann. Rev. Biochem. 64:621-
51 (1995).) A cytokine receptor family, capable of activating Jaks, is divided into two
groups: (a) Class 1 includes receptors for IL-2, IL-3, IL-4, IL-6, IL-7, IL-9, IL-11,
IL-12, IL-15, Epo, PRL, GH, G-CSF, GM-CSF, LIF, CNTF, and thrombopoietin; and (b)
Class 2 includes IFN-a, IFN-g, and IL-10. The Class 1 receptors share a conserved 
cysteine motif (a set of four conserved cysteines and one tryptophan) and a WSXWS 
motif (a membrane proximal region encoding Trp-Ser-Xxx-Trp-Ser (SEQ ID NO:2)). 

Thus, on binding of a ligand to a receptor, Jaks are activated, which in turn
activate STATs, which then translocate and bind to GAS elements. This entire 
process is encompassed in the Jaks-STATs signal transduction pathway. 

Therefore, activation of the Jaks-STATs pathway, reflected by the binding of 
the GAS or the ISRE element, can be used to indicate proteins involved in the 
proliferation and differentiation of cells. For example, growth factors and cytokines 
are known to activate the Jaks-STATs pathway. (See Table below.) Thus, by using 
GAS elements linked to reporter molecules, activators of the Jaks-STATs pathway 
can be identified.

PRL      ?    +/-   +    -    1,3,5
EPO      ?    -     +    -    5        GAS(B-CAS>IRF1=IFP»Ly6)
Receptor Tyrosine Kinases
EGF      ?    +     +    -    1,3      GAS(IRF1)
PDGF     ?    +     +    -    1,3
CSF-1    ?    +     +    -    1,3      GAS(not IRF1)

To construct a synthetic GAS-containing promoter element, which is used in
the Biological Assays described in Examples 13-14, a PCR based strategy is
employed to generate a GAS-SV40 promoter sequence. The 5' primer contains four
tandem copies of the GAS binding site found in the IRF1 promoter and previously
demonstrated to bind STATs upon induction with a range of cytokines (Rothman et
al., Immunity 1:457-468 (1994)), although other GAS or ISRE elements can be used
instead. The 5' primer also contains 18 bp of sequence complementary to the SV40
early promoter sequence and is flanked with an XhoI site. The sequence of the 5'
primer is:

5':GCGCCTCGAGATTTCCCCGAAATCTAGATTTCCCCGAAATGATTTCCCC
GAAATGATTTCCCCGAAATATCTGCCATCTCAATTAG:3' (SEQ ID NO:3)

The downstream primer is complementary to the SV40 promoter and is
flanked with a Hind III site: 5':GCGGCAAGCTTTTTGCAAAGCCTAGGC:3'
(SEQ ID NO:4)
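
The stated features of the 5' primer (four tandem GAS copies, an XhoI site, and an 18 bp SV40-complementary tail) can be confirmed by simple string inspection. The Python sketch below is illustrative only: the helper name is hypothetical, the repeat unit is simply the 13-base motif read off the primer sequence, and CTCGAG is the XhoI recognition sequence.

```python
FIVE_PRIME_PRIMER = (
    "GCGCCTCGAGATTTCCCCGAAATCTAGATTTCCCCGAAATGATTTCCCC"
    "GAAATGATTTCCCCGAAATATCTGCCATCTCAATTAG"
)
GAS_REPEAT = "ATTTCCCCGAAAT"  # repeated unit as it appears in the primer
XHOI_SITE = "CTCGAG"          # XhoI recognition sequence
SV40_TAIL_LENGTH = 18         # bases complementary to the SV40 early promoter


def count_sites(seq, site):
    """Count occurrences of `site` in `seq`, allowing overlaps."""
    width = len(site)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == site)


gas_copies = count_sites(FIVE_PRIME_PRIMER, GAS_REPEAT)
xhoi_sites = count_sites(FIVE_PRIME_PRIMER, XHOI_SITE)
sv40_tail = FIVE_PRIME_PRIMER[-SV40_TAIL_LENGTH:]
print(gas_copies, xhoi_sites, sv40_tail)
```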

PCR amplification is performed using the SV40 promoter template present in
the B-gal:promoter plasmid obtained from Clontech. The resulting PCR fragment is
digested with XhoI/Hind III and subcloned into BLSK2-. (Stratagene.) Sequencing
with forward and reverse primers confirms that the insert contains the following
sequence:

5':CTCGAGATTTCCCCGAAATCTAGATTTCCCCGAAATGATTTCCCCGAAA
TGATTTCCCCGAAATATCTGCCATCTCAATTAGTCAGCAACCATAGTCCCG 
CCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCT 
CCGCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCC 
TCGGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCT 

AGGCTTTTGCAAAAAGCTT:3' (SEQ ID NO:5)

With this GAS promoter element linked to the SV40 promoter, a GAS:SEAP2
reporter construct is next engineered. Here, the reporter molecule is a secreted
alkaline phosphatase, or "SEAP." Clearly, however, any reporter molecule can be
used instead of SEAP, in this or in any of the other Examples. Well known reporter
molecules that can be used instead of SEAP include chloramphenicol
acetyltransferase (CAT), luciferase, alkaline phosphatase, B-galactosidase, green
fluorescent protein (GFP), or any protein detectable by an antibody.

The above sequence confirmed synthetic GAS-SV40 promoter element is
subcloned into the pSEAP-Promoter vector obtained from Clontech using HindIII and
XhoI, effectively replacing the SV40 promoter with the amplified GAS:SV40
promoter element, to create the GAS-SEAP vector. However, this vector does not
contain a neomycin resistance gene, and therefore, is not preferred for mammalian
expression systems.

Thus, in order to generate mammalian stable cell lines expressing the GAS- 
SEAP reporter, the GAS-SEAP cassette is removed from the GAS-SEAP vector using 
SalI and NotI, and inserted into a backbone vector containing the neomycin resistance
gene, such as pGFP-1 (Clontech), using these restriction sites in the multiple cloning 
site, to create the GAS-SEAP/Neo vector. Once this vector is transfected into 
mammalian cells, this vector can then be used as a reporter molecule for GAS binding 
as described in Examples 13-14. 

Other constructs can be made using the above description and replacing GAS 
with a different promoter sequence. For example, construction of reporter molecules 
containing NF-KB and EGR promoter sequences is described in Examples 15 and
16. However, many other promoters can be substituted using the protocols described
in these Examples. For instance, SRE, IL-2, NFAT, or Osteocalcin promoters can be
substituted, alone or in combination (e.g., GAS/NF-KB/EGR, GAS/NF-KB,
IL-2/NFAT, or NF-KB/GAS). Similarly, other cell lines can be used to test reporter
construct activity, such as HELA (epithelial), HUVEC (endothelial), Reh (B-cell),
Saos-2 (osteoblast), HUVAC (aortic), or Cardiomyocyte. 

Example 13: High-Throughput Screening Assay for T-cell Activity

The following protocol is used to assess T-cell activity by identifying factors, 
such as growth factors and cytokines, that may proliferate or differentiate T-cells. T- 
cell activity is assessed using the GAS/SEAP/Neo construct produced in Example 12. 
Thus, factors that increase SEAP activity indicate the ability to activate the Jaks- 
STATS signal transduction pathway. The T-cells used in this assay are Jurkat T-cells
(ATCC Accession No. TIB-152), although Molt-3 cells (ATCC Accession No. CRL-
1552) and Molt-4 cells (ATCC Accession No. CRL-1582) can also be used.

Jurkat T-cells are lymphoblastic CD4+ Th1 helper cells. In order to generate
stable cell lines, approximately 2 million Jurkat cells are transfected with the GAS-
SEAP/neo vector using DMRIE-C (Life Technologies) (transfection procedure
described below). The transfected cells are seeded to a density of approximately
20,000 cells per well and transfectants resistant to 1 mg/ml genticin are selected.
Resistant colonies are expanded and then tested for their response to increasing
concentrations of interferon gamma. The dose response of a selected clone is
demonstrated.

Specifically, the following protocol will yield sufficient cells for 75 wells 
containing 200 ul of cells. Thus, it is either scaled up, or performed in multiple to 
generate sufficient cells for multiple 96 well plates. Jurkat cells are maintained in 
RPMI + 10% serum with 1% Pen-Strep. Combine 2.5 mls of OPTI-MEM (Life
Technologies) with 10 ug of plasmid DNA in a T25 flask. Add 2.5 ml OPTI-MEM 
containing 50 ul of DMRIE-C and incubate at room temperature for 15-45 mins. 

During the incubation period, count cell concentration, spin down the required 
number of cells (10^7 per transfection), and resuspend in OPTI-MEM to a final
concentration of 10^7 cells/ml. Then add 1 ml of 1 x 10^7 cells in OPTI-MEM to T25
flask and incubate at 37°C for 6 hrs. After the incubation, add 10 ml of RPMI + 15%
serum. 

The Jurkat:GAS-SEAP stable reporter lines are maintained in RPMI + 10%
serum, 1 mg/ml Genticin, and 1% Pen-Strep. These cells are treated with
supernatants containing a polypeptide as produced by the protocol described in
Example 11.

On the day of treatment with the supernatant, the cells should be washed and
resuspended in fresh RPMI + 10% serum to a density of 500,000 cells per ml. The
exact number of cells required will depend on the number of supernatants being
screened. For one 96 well plate, approximately 10 million cells (for 10 plates, 100
million cells) are required.

Transfer the cells to a triangular reservoir boat, in order to dispense the cells 
into a 96 well dish, using a 12 channel pipette. Using a 12 channel pipette, transfer 
200 ul of cells into each well (therefore adding 100,000 cells per well).
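
The seeding figures above follow from the cell density and the dispensed volume, as the short Python sketch below illustrates. The function names are hypothetical and the calculation includes no allowance for reservoir dead volume, which is an assumption of the sketch.

```python
def cells_per_well(density_per_ml, volume_ul):
    """Number of cells delivered to one well at the given density and volume."""
    return density_per_ml * volume_ul / 1000.0


def cells_per_plate(density_per_ml, volume_ul, wells=96):
    """Number of cells needed to fill every well of one plate."""
    return cells_per_well(density_per_ml, volume_ul) * wells


# Figures from the text: 500,000 cells per ml and 200 ul per well.
per_well = cells_per_well(500_000, 200)
per_plate = cells_per_plate(500_000, 200)
print(per_well, per_plate)
```

The per-plate figure comes to 9.6 million cells, which agrees with the approximately 10 million cells stated for one 96 well plate.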

After all the plates have been seeded, 50 ul of the supernatants are transferred
directly from the 96 well plate containing the supernatants into each well using a 12
channel pipette. In addition, a dose of exogenous interferon gamma (0.1, 1.0, 10 ng)
is added to wells H9, H10, and H11 to serve as additional positive controls for the
assay.

The 96 well dishes containing Jurkat cells treated with supernatants are placed
in an incubator for 48 hrs (note: this time is variable between 48-72 hrs). 35 ul
samples from each well are then transferred to an opaque 96 well plate using a 12
channel pipette. The opaque plates should be covered (using cellophane covers) and
stored at -20°C until SEAP assays are performed according to Example 17. The
plates containing the remaining treated cells are placed at 4°C and serve as a source
of material for repeating the assay on a specific well if desired.

As a positive control, 100 Unit/ml interferon gamma can be used which is 
known to activate Jurkat T cells. Over 30 fold induction is typically observed in the 
positive control wells.
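
Fold induction, as referred to above, is the ratio of the reporter signal in a treated well to the signal in an untreated control. The Python sketch below shows one way such values might be computed and screened. The readings, the well labels, and the threshold of 5 are all hypothetical and are not taken from this Example.

```python
def fold_induction(signal, control):
    """Ratio of a well's reporter signal to the untreated control signal."""
    if control <= 0:
        raise ValueError("control signal must be positive")
    return signal / control


def find_hits(readings, control, threshold):
    """Return the wells whose fold induction meets or exceeds the threshold."""
    hits = {}
    for well, signal in readings.items():
        fold = fold_induction(signal, control)
        if fold >= threshold:
            hits[well] = fold
    return hits


# Hypothetical SEAP readings in arbitrary units; H9 stands in for a
# positive control well treated with interferon gamma.
readings = {"A1": 120.0, "A2": 1500.0, "H9": 4000.0}
hits = find_hits(readings, control=100.0, threshold=5.0)
print(hits)
```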

The above protocol may be used in the generation of both transiently and
stably transfected cells, as would be apparent to those of skill in the art.

Example 14: High-Throughput Screening Assay Identifying Myeloid Activity

The following protocol is used to assess myeloid activity by identifying
factors, such as growth factors and cytokines, that may proliferate or differentiate
myeloid cells. Myeloid cell activity is assessed using the GAS/SEAP/Neo construct
produced in Example 12. Thus, factors that increase SEAP activity indicate the
ability to activate the Jaks-STATS signal transduction pathway. The myeloid cell
used in this assay is U937, a pre-monocyte cell line, although TF-1, HL60, or KG1
can be used.

To transiently transfect U937 cells with the GAS/SEAP/Neo construct
produced in Example 12, a DEAE-Dextran method (Kharbanda et al., 1994, Cell
Growth & Differentiation, 5:259-265) is used. First, harvest 2x10^7 U937 cells and
wash with PBS. The U937 cells are usually grown in RPMI 1640 medium containing
10% heat-inactivated fetal bovine serum (FBS) supplemented with 100 units/ml
penicillin and 100 mg/ml streptomycin.

Next, suspend the cells in 1 ml of 20 mM Tris-HCl (pH 7.4) buffer containing
0.5 mg/ml DEAE-Dextran, 8 ug GAS-SEAP2 plasmid DNA, 140 mM NaCl, 5 mM
KCl, 375 uM Na2HPO4.7H2O, 1 mM MgCl2, and 675 uM CaCl2. Incubate at 37°C
for 45 min.

Wash the cells with RPMI 1640 medium containing 10% FBS and then
resuspend in 10 ml complete medium and incubate at 37°C for 36 hr.

The GAS-SEAP/U937 stable cells are obtained by growing the cells in 400
ug/ml G418. G418-free medium is used for routine growth, but every one to two
months the cells should be re-grown in 400 ug/ml G418 for a couple of passages.

These cells are tested by harvesting 1x10^8 cells (this is enough for ten 96-well
plate assays) and washing with PBS. Suspend the cells in 200 ml of the growth
medium described above, with a final density of 5x10^5 cells/ml. Plate 200 ul of
cells per well in the 96-well plate (or 1x10^5 cells/well).
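
The plating quantities in this step are linked by simple arithmetic. The
following Python sketch checks them, taking the seeding density as 5x10^5
cells/ml and 200 ul per well. The function name and its parameters are
illustrative assumptions and are not part of the protocol.

```python
def plating_plan(density_per_ml, well_volume_ul, wells_per_plate=96, plates=10):
    """Return (cells per well, total suspension volume in ml, total cells)."""
    well_volume_ml = well_volume_ul / 1000.0
    cells_per_well = density_per_ml * well_volume_ml
    total_volume_ml = well_volume_ml * wells_per_plate * plates
    total_cells = density_per_ml * total_volume_ml
    return cells_per_well, total_volume_ml, total_cells


# Assumed inputs: 5x10^5 cells/ml, 200 ul per well, ten 96-well plates.
cells_per_well, volume_ml, total_cells = plating_plan(5e5, 200)
print(f"{cells_per_well:.0f} cells/well, {volume_ml:.0f} ml, {total_cells:.2e} cells")
```

Under these assumptions ten plates consume about 192 ml of suspension and
9.6x10^7 cells, so harvesting roughly 1x10^8 cells into 200 ml leaves a small
excess for pipetting losses.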

Add 50 ul of the supernatant prepared by the protocol described in Example
11. Incubate at 37°C for 48 to 72 hr. As a positive control, 100 Unit/ml interferon
gamma, which is known to activate U937 cells, can be used. Over 30-fold induction
is typically observed in the positive control wells. SEAP assay the supernatant
according to the protocol described in Example 17.

Example 15: High-Throughput Screening Assay Identifying Neuronal Activity

When cells undergo differentiation and proliferation, a group of genes is
activated through many different signal transduction pathways. One of these genes,
EGR1 (early growth response gene 1), is induced in various tissues and cell types
upon activation. The promoter of EGR1 is responsible for such induction. Using the
EGR1 promoter linked to reporter molecules, activation of cells can be assessed.

Particularly, the following protocol is used to assess neuronal activity in PC12
cell lines. PC12 cells (rat pheochromocytoma cells) are known to proliferate and/or
differentiate by activation with a number of mitogens, such as TPA (tetradecanoyl
phorbol acetate), NGF (nerve growth factor), and EGF (epidermal growth factor).
The EGR1 gene expression is activated during this treatment. Thus, by stably
transfecting PC12 cells with a construct containing an EGR promoter linked to SEAP
reporter, activation of PC12 cells can be assessed.

The EGR/SEAP reporter construct can be assembled by the following
protocol. The EGR-1 promoter sequence (-633 to +1) (Sakamoto K et al., Oncogene
6:867-871 (1991)) can be PCR amplified from human genomic DNA using the
following primers:

5' GCGCTCGAGGGATGACAGCGATAGAACCCCGG-3' (SEQ ID NO:6)
5' GCGAAGCTTCGCGACTCCCCGGATCCGCCTC-3' (SEQ ID NO:7)

Using the GAS:SEAP/Neo vector produced in Example 12, the EGR1 amplified
product can then be inserted into this vector. Linearize the GAS:SEAP/Neo vector
using restriction enzymes XhoI/HindIII, removing the GAS/SV40 stuffer. Restrict the
EGR1 amplified product with these same enzymes. Ligate the vector and the EGR1
promoter.
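
Because the cloning step relies on the XhoI and HindIII sites carried by the
two primers, it can be useful to confirm that each primer contains the expected
recognition sequence. The sketch below is only an illustrative check. The
recognition sequences (CTCGAG for XhoI, AAGCTT for HindIII) are standard, while
the dictionary and function names are assumptions introduced here.

```python
# Recognition sequences for the two enzymes named in the protocol.
SITES = {"XhoI": "CTCGAG", "HindIII": "AAGCTT"}

# Primer sequences as listed above (5' to 3').
PRIMERS = {
    "SEQ ID NO:6": "GCGCTCGAGGGATGACAGCGATAGAACCCCGG",
    "SEQ ID NO:7": "GCGAAGCTTCGCGACTCCCCGGATCCGCCTC",
}


def find_sites(primer):
    """Map enzyme name -> 0-based position of its site, for sites that occur."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


for label, seq in PRIMERS.items():
    print(label, find_sites(seq))
```

Each primer carries exactly one of the two sites, three bases in from its 5'
end, which is consistent with directional insertion into the XhoI/HindIII-cut
vector.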

To prepare 96-well plates for cell culture, two mls of a coating solution (1:30
dilution of collagen type I (Upstate Biotech Inc. Cat#08-115) in 30% ethanol (filter
sterilized)) is added per one 10 cm plate or 50 ul per well of the 96-well plate, and
allowed to air dry for 2 hr.

PC12 cells are routinely grown in RPMI-1640 medium (Bio Whittaker)
containing 10% horse serum (JRH BIOSCIENCES, Cat. # 12449-78P), 5% heat-
inactivated fetal bovine serum (FBS) supplemented with 100 units/ml penicillin and
100 ug/ml streptomycin on a precoated 10 cm tissue culture dish. A one to four split
is done every three to four days. Cells are removed from the plates by scraping and
resuspended by pipetting up and down more than 15 times.

Transfect the EGR/SEAP/Neo construct into PC12 using the Lipofectamine
protocol described in Example 11. EGR-SEAP/PC12 stable cells are obtained by
growing the cells in 300 ug/ml G418. G418-free medium is used for routine
growth, but every one to two months the cells should be re-grown in 300 ug/ml G418
for a couple of passages.

To assay for neuronal activity, a 10 cm plate with cells around 70 to 80%
confluent is screened by removing the old medium. Wash the cells once with PBS
(phosphate buffered saline). Then starve the cells in low serum medium (RPMI-1640
containing 1% horse serum and 0.5% FBS with antibiotics) overnight.

The next morning, remove the medium and wash the cells with PBS. Scrape
the cells off the plate and suspend them well in 2 ml low serum medium. Count
the cells and add more low serum medium to reach a final cell density of 5x10^5
cells/ml.

Add 200 ul of the cell suspension to each well of a 96-well plate (equivalent to
1x10^5 cells/well). Add 50 ul of supernatant produced by Example 11 and incubate
at 37°C for 48 to 72 hr. As a positive control, a growth factor known to activate
PC12 cells through EGR can be used, such as 50 ng/ul of Neuronal Growth Factor
(NGF). Over fifty-fold induction of SEAP is typically seen in the positive control
wells. SEAP assay the supernatant according to Example 17.

Example 16: High-Throughput Screening Assay for T-cell Activity

NF-kB (Nuclear Factor kB) is a transcription factor activated by a wide
variety of agents including the inflammatory cytokines IL-1 and TNF, CD30 and
CD40, lymphotoxin-alpha and lymphotoxin-beta, by exposure to LPS or thrombin,
and by expression of certain viral gene products. As a transcription factor, NF-kB
regulates the expression of genes involved in immune cell activation, control of
apoptosis (NF-kB appears to shield cells from apoptosis), B and T-cell development,
anti-viral and antimicrobial responses, and multiple stress responses.

In non-stimulated conditions, NF-kB is retained in the cytoplasm with I-kB
(Inhibitor kB). However, upon stimulation, I-kB is phosphorylated and degraded,
causing NF-kB to shuttle to the nucleus, thereby activating transcription of target
genes. Target genes activated by NF-kB include IL-2, IL-6, GM-CSF, ICAM-1 and
class 1 MHC.

Due to its central role and ability to respond to a range of stimuli, reporter
constructs utilizing the NF-kB promoter element are used to screen the supernatants
produced in Example 11. Activators or inhibitors of NF-kB would be useful in
treating diseases. For example, inhibitors of NF-kB could be used to treat those
diseases related to the acute or chronic activation of NF-kB, such as rheumatoid
arthritis.

To construct a vector containing the NF-kB promoter element, a PCR based
strategy is employed. The upstream primer contains four tandem copies of the NF-kB
binding site (GGGGACTTTCCC) (SEQ ID NO:8), 18 bp of sequence complementary
to the 5' end of the SV40 early promoter sequence, and is flanked with an XhoI site:
5':GCGGCCTCGAGGGGACTTTCCCGGGGACTTTCCGGGGACTTTCCGGGAC
TTTCCATCCTGCCATCTCAATTAG:3' (SEQ ID NO:9)

The downstream primer is complementary to the 3' end of the SV40 promoter
and is flanked with a Hind III site:

5':GCGGCAAGCTTTTTGCAAAGCCTAGGC:3' (SEQ ID NO:4)

PCR amplification is performed using the SV40 promoter template present in
the pB-gal:promoter plasmid obtained from Clontech. The resulting PCR fragment is
digested with XhoI and Hind III and subcloned into BLSK2- (Stratagene).
Sequencing with the T7 and T3 primers confirms the insert contains the following
sequence:

5':CTCGAGGGGACTTTCCCGGGGACTTTCCGGGGACTTTCCGGGACTTTCC 
ATCTGCCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCC 
ATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGA
CTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTA 
TTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAA 
GCTT:3' (SEQ ID NO: 10) 

Next, replace the SV40 minimal promoter element present in the pSEAP2-promoter
plasmid (Clontech) with this NF-kB/SV40 fragment using XhoI and
HindIII. However, this vector does not contain a neomycin resistance gene, and
therefore, is not preferred for mammalian expression systems.

In order to generate stable mammalian cell lines, the NF-kB/SV40/SEAP
cassette is removed from the above NF-kB/SEAP vector using restriction enzymes
SalI and NotI, and inserted into a vector containing neomycin resistance. Particularly,
the NF-kB/SV40/SEAP cassette was inserted into pGFP-1 (Clontech), replacing the
GFP gene, after restricting pGFP-1 with SalI and NotI.

Once the NF-kB/SV40/SEAP/Neo vector is created, stable Jurkat T-cells are
created and maintained according to the protocol described in Example 13. Similarly,
the method for assaying supernatants with these stable Jurkat T-cells is also described
in Example 13. As a positive control, exogenous TNF alpha (0.1, 1, 10 ng) is added to
wells H9, H10, and H11, with a 5-10 fold activation typically observed.

Example 17: Assay for SEAP Activity

As a reporter molecule for the assays described in Examples 13-16, SEAP
activity is assayed using the Tropix Phospho-light Kit (Cat. BP-400) according to the
following general procedure. The Tropix Phospho-light Kit supplies the Dilution,
Assay, and Reaction Buffers used below.

Prime a dispenser with the 2.5x Dilution Buffer and dispense 15 ul of 2.5x
dilution buffer into Optiplates containing 35 ul of a supernatant. Seal the plates with
a plastic sealer and incubate at 65°C for 30 min. Separate the Optiplates to avoid
uneven heating.

Cool the samples to room temperature for 15 minutes. Empty the dispenser
and prime with the Assay Buffer. Add 50 ul Assay Buffer and incubate at room
temperature for 5 min. Empty the dispenser and prime with the Reaction Buffer (see
the table below). Add 50 ul Reaction Buffer and incubate at room temperature for 20
minutes. Since the intensity of the chemiluminescent signal is time dependent, and it
takes about 10 minutes to read 5 plates on the luminometer, one should treat 5 plates
at a time and start the second set 10 minutes later.

Read the relative light unit in the luminometer. Set H12 as blank, and print
the results. An increase in chemiluminescence indicates reporter activity.
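
The screening examples report results as fold induction. One common way to
compute it from luminometer readings is to subtract the blank well from every
reading and divide the treated signal by an untreated control. The sketch
below illustrates that calculation. The formula, the function name, and the
example readings are assumptions made for illustration and are not specified
by the protocol.

```python
def fold_induction(sample_rlu, control_rlu, blank_rlu):
    """Blank-corrected sample signal divided by blank-corrected control signal."""
    corrected_control = control_rlu - blank_rlu
    if corrected_control <= 0:
        raise ValueError("control signal must exceed the blank")
    return (sample_rlu - blank_rlu) / corrected_control


# Hypothetical relative light units, for illustration only.
blank = 50.0        # blank well (H12)
untreated = 250.0   # cells that received no supernatant
treated = 6250.0    # cells that received a test supernatant
print(fold_induction(treated, untreated, blank))  # 31.0
```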

Reaction Buffer Formulation:

# of plates    Rxn buffer diluent (ml)    CSPD (ml)
11             65                         3.25
12             70                         3.5
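
The two rows listed above follow a simple pattern: the diluent volume equals 5
ml per plate plus 10 ml, and the CSPD volume is one twentieth of the diluent
volume. The Python sketch below extrapolates that pattern to other plate
counts. The rule is inferred only from the two listed rows, so it should be
treated as an assumption rather than as the kit's formulation, and the
function name is hypothetical.

```python
def reaction_buffer(plates):
    """Return (diluent ml, CSPD ml) extrapolated from the two listed rows."""
    diluent_ml = 5 * plates + 10   # assumed: 5 ml per plate plus 10 ml excess
    cspd_ml = diluent_ml / 20      # assumed: CSPD at 1/20 of the diluent volume
    return diluent_ml, cspd_ml


for n in (11, 12):
    print(n, reaction_buffer(n))
```

At 50 ul of Reaction Buffer per well, one 96-well plate uses 4.8 ml, which is
consistent with the roughly 5 ml per plate seen in the listed rows.
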
Example 18: High-Throughput Screening Assay Identifying Changes in Small
Molecule Concentration and Membrane Permeability

Binding of a ligand to a receptor is known to alter intracellular levels of small 
molecules, such as calcium, potassium, sodium, and pH, as well as alter membrane
potential. These alterations can be measured in an assay to identify supernatants
which bind to receptors of a particular cell. Although the following protocol 
describes an assay for calcium, this protocol can easily be modified to detect changes 
in potassium, sodium, pH, membrane potential, or any other small molecule which is 
detectable by a fluorescent probe. 

The following assay uses a Fluorometric Imaging Plate Reader ("FLIPR") to
measure changes in fluorescent molecules (Molecular Probes) that bind small
molecules. Clearly, any fluorescent molecule detecting a small molecule can be used
instead of the calcium fluorescent molecule, fluo-4 (Molecular Probes, Inc.;
catalog no. F-14202), used here.

For adherent cells, seed the cells at 10,000-20,000 cells/well in a Co-star
black 96-well plate with clear bottom. The plate is incubated in a CO2 incubator for
20 hours. The adherent cells are washed two times in a Biotek washer with 200 ul of
HBSS (Hank's Balanced Salt Solution) leaving 100 ul of buffer after the final wash.

A stock solution of 1 mg/ml fluo-4 is made in 10% pluronic acid DMSO. To
load the cells with fluo-4, 50 ul of 12 ug/ml fluo-4 is added to each well. The plate
is incubated at 37°C in a CO2 incubator for 60 min. The plate is washed four times in
the Biotek washer with HBSS leaving 100 ul of buffer.

For non-adherent cells, the cells are spun down from culture media. Cells are
re-suspended to 2-5x10^6 cells/ml with HBSS in a 50-ml conical tube. 4 ul of 1 mg/ml
fluo-4 solution in 10% pluronic acid DMSO is added to each ml of cell suspension.
The tube is then placed in a 37°C water bath for 30-60 min. The cells are washed
twice with HBSS, resuspended to 1x10^6 cells/ml, and dispensed into a microplate, 100
ul/well. The plate is centrifuged at 1000 rpm for 5 min. The plate is then washed
once in Denley CellWash with 200 ul, followed by an aspiration step to 100 ul final
volume.
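
The two loading procedures give nearly the same final dye concentration, which
can be verified with a short dilution calculation. The sketch below assumes
that the 50 ul of 12 ug/ml fluo-4 is added to the 100 ul of buffer left in each
well of adherent cells. The function name is hypothetical.

```python
def final_conc(stock_conc, added_volume, starting_volume):
    """Concentration after adding `added_volume` of stock to `starting_volume`."""
    return stock_conc * added_volume / (added_volume + starting_volume)


# Adherent cells: 50 ul of 12 ug/ml dye added to 100 ul of buffer in the well.
adherent = final_conc(12.0, 50.0, 100.0)
# Non-adherent cells: 4 ul of 1 mg/ml (1000 ug/ml) stock per 1 ml (1000 ul) of cells.
suspension = final_conc(1000.0, 4.0, 1000.0)
print(round(adherent, 2), round(suspension, 2))  # 4.0 3.98
```

Both routes therefore load the cells at about 4 ug/ml fluo-4.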
For a non-cell based assay, each well contains a fluorescent molecule, such as
fluo-4. The supernatant is added to the well, and a change in fluorescence is
detected.

To measure the fluorescence of intracellular calcium, the FLIPR is set for the
following parameters: (1) System gain is 300-800 mW; (2) Exposure time is 0.4
second; (3) Camera F/stop is F/2; (4) Excitation is 488 nm; (5) Emission is 530 nm;
and (6) Sample addition is 50 ul. Increased emission at 530 nm indicates an
extracellular signaling event which has resulted in an increase in the intracellular
Ca++ concentration.

Example 19: High-Throughput Screening Assay Identifying Tyrosine Kinase
Activity

The Protein Tyrosine Kinases (PTK) represent a diverse group of
transmembrane and cytoplasmic kinases. Within the Receptor Protein Tyrosine
Kinase (RPTK) group are receptors for a range of mitogenic and metabolic growth
factors including the PDGF, FGF, EGF, NGF, HGF and Insulin receptor subfamilies.
In addition there is a large family of RPTKs for which the corresponding ligand is
unknown. Ligands for RPTKs include mainly secreted small proteins, but also
membrane-bound and extracellular matrix proteins.

Activation of RPTK by ligands involves ligand-mediated receptor
dimerization, resulting in transphosphorylation of the receptor subunits and activation
of the cytoplasmic tyrosine kinases. The cytoplasmic tyrosine kinases include
receptor associated tyrosine kinases of the src-family (e.g., src, yes, lck, lyn, fyn) and
non-receptor linked and cytosolic protein tyrosine kinases, such as the Jak family,
members of which mediate signal transduction triggered by the cytokine superfamily
of receptors (e.g., the Interleukins, Interferons, GM-CSF, and Leptin).

Because of the wide range of known factors capable of stimulating tyrosine
kinase activity, the identification of novel human secreted proteins capable of
activating tyrosine kinase signal transduction pathways is of interest. Therefore, the
following protocol is designed to identify those novel human secreted proteins
capable of activating the tyrosine kinase signal transduction pathways.
Seed target cells (e.g., primary keratinocytes) at a density of approximately
25,000 cells per well in 96 well Loprodyne Silent Screen Plates purchased from
Nalge Nunc (Naperville, IL). The plates are sterilized with two 30 minute rinses with
100% ethanol, rinsed with water and dried overnight. Some plates are coated for 2 hr
with 100 ul of cell culture grade type I collagen (50 ug/ml), gelatin (2%) or
polylysine (50 ug/ml), all of which can be purchased from Sigma Chemicals (St.
Louis, MO), or 10% Matrigel purchased from Becton Dickinson (Bedford, MA), or
calf serum, rinsed with PBS and stored at 4°C. Cell growth on these plates is assayed
by seeding 5,000 cells/well in growth medium and indirect quantitation of cell
number through use of alamarBlue as described by the manufacturer Alamar
Biosciences, Inc. (Sacramento, CA) after 48 hr. Falcon plate covers #3071 from
Becton Dickinson (Bedford, MA) are used to cover the Loprodyne Silent Screen
Plates. Falcon Microtest III cell culture plates can also be used in some proliferation
experiments.

To prepare extracts, A431 cells are seeded onto the nylon membranes of
Loprodyne plates (20,000/200 ul/well) and cultured overnight in complete medium.
Cells are quiesced by incubation in serum-free basal medium for 24 hr. After 5-20
minutes treatment with EGF (60 ng/ml) or 50 ul of the supernatant produced in
Example 11, the medium is removed and 100 ul of extraction buffer (20 mM
HEPES pH 7.5, 0.15 M NaCl, 1% Triton X-100, 0.1% SDS, 2 mM Na3VO4, 2 mM
Na4P2O7 and a cocktail of protease inhibitors (# 1836170) obtained from
Boehringer Mannheim (Indianapolis, IN)) is added to each well and the plate is
shaken on a rotating shaker for 5 minutes at 4°C. The plate is then placed in a
vacuum transfer manifold and the extract filtered through the 0.45 um membrane
bottoms of each well using house vacuum. Extracts are collected in a 96-well
catch/assay plate in the bottom of the vacuum manifold and immediately placed on
ice. To obtain extracts clarified by centrifugation, the content of each well, after
detergent solubilization for 5 minutes, is removed and centrifuged for 15 minutes at
4°C at 16,000 x g.
Test the filtered extracts for levels of tyrosine kinase activity. Although many
methods of detecting tyrosine kinase activity are known, one method is described 
here. 

Generally, the tyrosine kinase activity of a supernatant is evaluated by
determining its ability to phosphorylate a tyrosine residue on a specific substrate (a
biotinylated peptide). Biotinylated peptides that can be used for this purpose include
PSK1 (corresponding to amino acids 6-20 of the cell division kinase cdc2-p34) and
PSK2 (corresponding to amino acids 1-17 of gastrin). Both peptides are substrates for
a range of tyrosine kinases and are available from Boehringer Mannheim.

The tyrosine kinase reaction is set up by adding the following components in
order. First, add 10 ul of 5 uM Biotinylated Peptide, then 10 ul ATP/Mg2+ (5 mM
ATP/50 mM MgCl2), then 10 ul of 5x Assay Buffer (40 mM imidazole hydrochloride,
pH 7.3, 40 mM beta-glycerophosphate, 1 mM EGTA, 100 mM MgCl2, 5 mM MnCl2,
0.5 mg/ml BSA), then 5 ul of Sodium Vanadate (1 mM), and then 5 ul of water. Mix the
components gently and preincubate the reaction mix at 30°C for 2 min. Initiate the
reaction by adding 10 ul of the control enzyme or the filtered supernatant.

The tyrosine kinase assay reaction is then terminated by adding 10 ul of
120 mM EDTA and placing the reactions on ice.
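
The component volumes listed for the kinase reaction add up to 50 ul, which is
why the 5x Assay Buffer ends at 1x. The sketch below tabulates the final
concentrations from the stated volumes and stock concentrations. The data
structure and function name are illustrative assumptions, and only the
components whose stock concentration is stated as a single value are tracked.

```python
# (component, volume in ul, stock concentration, unit), in order of addition
COMPONENTS = [
    ("biotinylated peptide", 10, 5.0, "uM"),
    ("ATP", 10, 5.0, "mM"),
    ("assay buffer", 10, 5.0, "x"),
    ("sodium vanadate", 5, 1.0, "mM"),
    ("water", 5, None, None),
    ("enzyme or supernatant", 10, None, None),
]


def final_concentrations(components):
    """Return total volume (ul) and each tracked component's final concentration."""
    total = sum(volume for _, volume, _, _ in components)
    finals = {
        name: (stock * volume / total, unit)
        for name, volume, stock, unit in components
        if stock is not None
    }
    return total, finals


total_ul, finals = final_concentrations(COMPONENTS)
print(total_ul)  # 50
for name, (conc, unit) in finals.items():
    print(f"{name}: {conc:g} {unit}")
```

Adding the 10 ul of EDTA stop solution brings the volume to 60 ul, from which
the 50 ul aliquot used in the detection step is drawn.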

Tyrosine kinase activity is determined by transferring a 50 ul aliquot of reaction
mixture to a microtiter plate (MTP) module and incubating at 37°C for 20 min. This
allows the streptavidin coated 96 well plate to associate with the biotinylated peptide.
Wash the MTP module with 300 ul/well of PBS four times. Next add 75 ul of
anti-phosphotyrosine antibody conjugated to horse radish peroxidase (anti-P-Tyr-POD
(0.5 u/ml)) to each well and incubate at 37°C for one hour. Wash the well as
above.

Next add 100 ul of peroxidase substrate solution (Boehringer Mannheim) and
incubate at room temperature for at least 5 mins (up to 30 min). Measure the
absorbance of the sample at 405 nm by using an ELISA reader. The level of bound
peroxidase activity is quantitated using an ELISA reader and reflects the level of
tyrosine kinase activity.
Example 20: High-Throughput Screening Assay Identifying Phosphorylation
Activity

As a potential alternative and/or complement to the assay of protein tyrosine
kinase activity described in Example 19, an assay which detects activation
(phosphorylation) of major intracellular signal transduction intermediates can also be
used. For example, as described below one particular assay can detect tyrosine
phosphorylation of the Erk-1 and Erk-2 kinases. However, phosphorylation of other
molecules, such as Raf, JNK, p38 MAP, Map kinase kinase (MEK), MEK kinase,
Src, Muscle specific kinase (MuSK), IRAK, Tec, and Janus, as well as any other
phosphoserine, phosphotyrosine, or phosphothreonine molecule, can be detected by
substituting these molecules for Erk-1 or Erk-2 in the following assay.

Specifically, assay plates are made by coating the wells of a 96-well ELISA
plate with 0.1 ml of protein G (1 ug/ml) for 2 hr at room temp. (RT). The plates are
then rinsed with PBS and blocked with 3% BSA/PBS for 1 hr at RT. The protein G
plates are then treated with 2 commercial monoclonal antibodies (100 ng/well) against
Erk-1 and Erk-2 (1 hr at RT) (Santa Cruz Biotechnology). (To detect other molecules,
this step can easily be modified by substituting a monoclonal antibody detecting any
of the above described molecules.) After 3-5 rinses with PBS, the plates are stored at
4°C until use.

A431 cells are seeded at 20,000/well in a 96-well Loprodyne filterplate and
cultured overnight in growth medium. The cells are then starved for 48 hr in basal
medium (DMEM) and then treated with EGF (6 ng/well) or 50 ul of the supernatants
obtained in Example 11 for 5-20 minutes. The cells are then solubilized and extracts
filtered directly into the assay plate.

After incubation with the extract for 1 hr at RT, the wells are again rinsed. As
a positive control, a commercial preparation of MAP kinase (10 ng/well) is used in
place of A431 extract. Plates are then treated with a commercial polyclonal (rabbit)
antibody (1 ug/ml) which specifically recognizes the phosphorylated epitope of the
Erk-1 and Erk-2 kinases (1 hr at RT). This antibody is biotinylated by standard
procedures. The bound polyclonal antibody is then quantitated by successive
incubations with Europium-streptavidin and Europium fluorescence enhancing
reagent in the Wallac DELFIA instrument (time-resolved fluorescence). An increased
fluorescent signal over background indicates phosphorylation.

Example 21: Method of Determining Alterations in a Gene Corresponding to a
Polynucleotide

RNA is isolated from entire families or individual patients presenting with a
phenotype of interest (such as a disease). cDNA is then generated from
these RNA samples using protocols known in the art. (See, Sambrook.) The cDNA
is then used as a template for PCR, employing primers surrounding regions of interest
in SEQ ID NO:X. Suggested PCR conditions consist of 35 cycles at 95°C for 30
seconds; 60-120 seconds at 52-58°C; and 60-120 seconds at 70°C, using buffer
solutions described in Sidransky, D., et al., Science 252:706 (1991).
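
The suggested cycling conditions leave the annealing and extension times as
ranges, so the run time varies considerably. The sketch below estimates the
total hold time for the shortest and longest programs, reading the steps as 30
seconds of denaturation followed by 60-120 seconds each of annealing and
extension for 35 cycles. It ignores ramp times and any initial or final holds,
and the function name is hypothetical.

```python
def cycling_time_minutes(cycles, step_seconds):
    """Total hold time in minutes, ignoring ramping between temperatures."""
    return cycles * sum(step_seconds) / 60


shortest = cycling_time_minutes(35, (30, 60, 60))    # 87.5
longest = cycling_time_minutes(35, (30, 120, 120))   # 157.5
print(shortest, longest)
```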

PCR products are then sequenced using primers labeled at their 5' end with T4
polynucleotide kinase, employing SequiTherm Polymerase (Epicentre
Technologies). The intron-exon borders of selected exons are also determined and
genomic PCR products analyzed to confirm the results. PCR products harboring
suspected mutations are then cloned and sequenced to validate the results of the direct
sequencing.

PCR products are cloned into T-tailed vectors as described in Holton, T.A. and
Graham, M.W., Nucleic Acids Research, 19:1156 (1991) and sequenced with T7
polymerase (United States Biochemical). Affected individuals are identified by
mutations not present in unaffected individuals.

Genomic rearrangements are also observed as a method of determining
alterations in a gene corresponding to a polynucleotide. Genomic clones isolated
according to Example 2 are nick-translated with digoxigenin-deoxy-uridine
5'-triphosphate (Boehringer Mannheim), and FISH performed as described in Johnson,
Cg. et al., Methods Cell Biol. 35:73-99 (1991). Hybridization with the labeled probe
is carried out using a vast excess of human cot-1 DNA for specific hybridization to
the corresponding genomic locus.

Chromosomes are counterstained with 4',6-diamidino-2-phenylindole and
propidium iodide, producing a combination of C- and R-bands. Aligned images for
precise mapping are obtained using a triple-band filter set (Chroma Technology,
Brattleboro, VT) in combination with a cooled charge-coupled device camera
(Photometrics, Tucson, AZ) and variable excitation wavelength filters. (Johnson, Cv.
et al., Genet. Anal. Tech. Appl., 8:75 (1991).) Image collection, analysis and
chromosomal fractional length measurements are performed using the ISee Graphical
Program System. (Inovision Corporation, Durham, NC.) Chromosome alterations of
the genomic region hybridized by the probe are identified as insertions, deletions, and
translocations. These alterations are used as a diagnostic marker for an associated
disease.

Example 22: Method of Detecting Abnormal Levels of a Polypeptide in a 
Biological Sample 

A polypeptide of the present invention can be detected in a biological sample, 
and if an increased or decreased level of the polypeptide is detected, this polypeptide 
is a marker for a particular phenotype. Methods of detection are numerous, and thus,
it is understood that one skilled in the art can modify the following assay to fit their 
particular needs. 

For example, antibody-sandwich ELISAs are used to detect polypeptides in a 
sample, preferably a biological sample. Wells of a microtiter plate are coated with 
specific antibodies, at a final concentration of 0.2 to 10 ug/ml. The antibodies are
either monoclonal or polyclonal and are produced by the method described in 
Example 10. The wells are blocked so that non-specific binding of the polypeptide to 
the well is reduced. 

The coated wells are then incubated for > 2 hours at RT with a sample
containing the polypeptide. Preferably, serial dilutions of the sample should be used
to validate results. The plates are then washed three times with deionized or distilled
water to remove unbound polypeptide.

Next, 50 ul of specific antibody-alkaline phosphatase conjugate, at a
concentration of 25-400 ng, is added and incubated for 2 hours at room temperature.
The plates are again washed three times with deionized or distilled water to remove
unbound conjugate.
Add 75 ul of 4-methylumbelliferyl phosphate (MUP) or p-nitrophenyl
phosphate (NPP) substrate solution to each well and incubate 1 hour at room
temperature. Measure the reaction by a microtiter plate reader. Prepare a standard
curve, using serial dilutions of a control sample, and plot polypeptide concentration
on the X-axis (log scale) and fluorescence or absorbance on the Y-axis (linear scale).
Interpolate the concentration of the polypeptide in the sample using the standard
curve.
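
Interpolating from a standard curve with a logarithmic concentration axis and
a linear signal axis can be done piecewise between adjacent standards. The
sketch below shows one simple way to do so. It assumes that the signal rises
strictly with concentration across the standards, and the function name and
example values are hypothetical. A fitted curve (for example a four-parameter
logistic) is a common alternative that is not shown here.

```python
import math


def interpolate_concentration(signal, standards):
    """Estimate concentration from a signal on a log-x, linear-y standard curve.

    `standards` is a list of (concentration, signal) pairs whose signal rises
    strictly with concentration. Interpolation is linear in log10(concentration).
    """
    points = sorted(standards)
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        if s_lo <= signal <= s_hi:
            fraction = (signal - s_lo) / (s_hi - s_lo)
            log_c = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("signal is outside the range of the standard curve")


# Hypothetical standards: (concentration in ng/ml, absorbance).
standards = [(1, 0.10), (10, 0.50), (100, 0.90)]
print(interpolate_concentration(0.70, standards))
```

A reading halfway between the 10 and 100 ng/ml standards maps to about 31.6
ng/ml, the geometric midpoint, rather than to 55 ng/ml, because the
concentration axis is logarithmic.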

Example 23: Formulating a Polypeptide 

The secreted polypeptide composition will be formulated and dosed in a 
fashion consistent with good medical practice, taking into account the clinical 
condition of the individual patient (especially the side effects of treatment with the 
secreted polypeptide alone), the site of delivery, the method of administration, the 
scheduling of administration, and other factors known to practitioners. The "effective 
amount" for purposes herein is thus determined by such considerations. 

As a general proposition, the total pharmaceutically effective amount of 
secreted polypeptide administered parenterally per dose will be in the range of about 1 
ug/kg/day to 10 mg/kg/day of patient body weight, although, as noted above, this will
be subject to therapeutic discretion. More preferably, this dose is at least 0.01
mg/kg/day, and most preferably for humans between about 0.01 and 1 mg/kg/day for
the hormone. If given continuously, the secreted polypeptide is typically
administered at a dose rate of about 1 ug/kg/hour to about 50 ug/kg/hour, either by
1-4 injections per day or by continuous subcutaneous infusions, for example, using a
mini-pump. An intravenous bag solution may also be employed. The length of
treatment needed to observe changes and the interval following treatment for
responses to occur appears to vary depending on the desired effect.
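
The stated ranges are expressed per kilogram of body weight, so the absolute
amounts follow by multiplication. The sketch below is purely an arithmetic
illustration of that unit conversion for a hypothetical 70 kg body weight. The
function names and the chosen weight are assumptions introduced here.

```python
def daily_dose_range_mg(body_weight_kg, low_mg_per_kg=0.001, high_mg_per_kg=10.0):
    """Total daily amount in mg for the stated 1 ug/kg/day to 10 mg/kg/day range."""
    return body_weight_kg * low_mg_per_kg, body_weight_kg * high_mg_per_kg


def infusion_per_day_mg(body_weight_kg, rate_ug_per_kg_hour):
    """Amount delivered over 24 hours of continuous infusion, in mg."""
    return body_weight_kg * rate_ug_per_kg_hour * 24 / 1000


# Hypothetical 70 kg body weight, for illustration only.
print(daily_dose_range_mg(70))
print(infusion_per_day_mg(70, 1), infusion_per_day_mg(70, 50))
```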

Pharmaceutical compositions containing the secreted protein of the invention
are administered orally, rectally, parenterally, intracisternally, intravaginally,
intraperitoneally, topically (as by powders, ointments, gels, drops or transdermal
patch), bucally, or as an oral or nasal spray. "Pharmaceutically acceptable carrier"
refers to a non-toxic solid, semisolid or liquid filler, diluent, encapsulating material or
formulation auxiliary of any type. The term "parenteral" as used herein refers to
modes of administration which include intravenous, intramuscular, intraperitoneal,
intrasternal, subcutaneous and intraarticular injection and infusion.

The secreted polypeptide is also suitably administered by sustained-release 
systems. Suitable examples of sustained-release compositions include
semi-permeable polymer matrices in the form of shaped articles, e.g., films, or
microcapsules. Sustained-release matrices include polylactides (U.S. Pat. No.
3,773,919, EP 58,481), copolymers of L-glutamic acid and gamma-ethyl-L-glutamate
(Sidman, U. et al., Biopolymers 22:547-556 (1983)), poly (2-hydroxyethyl
methacrylate) (R. Langer et al., J. Biomed. Mater. Res. 15:167-277 (1981), and R.
Langer, Chem. Tech. 12:98-105 (1982)), ethylene vinyl acetate (R. Langer et al.) or
poly-D-(-)-3-hydroxybutyric acid (EP 133,988). Sustained-release compositions
also include liposomally entrapped polypeptides. Liposomes containing the secreted
polypeptide are prepared by methods known per se: DE 3,218,121; Epstein et al.,
Proc. Natl. Acad. Sci. USA 82:3688-3692 (1985); Hwang et al., Proc. Natl. Acad. Sci.
USA 77:4030-4034 (1980); EP 52,322; EP 36,676; EP 88,046; EP 143,949; EP
142,641; Japanese Pat. Appl. 83-118008; U.S. Pat. Nos. 4,485,045 and 4,544,545; and
EP 102,324. Ordinarily, the liposomes are of the small (about 200-800 Angstroms)
unilamellar type in which the lipid content is greater than about 30 mol. percent
cholesterol, the selected proportion being adjusted for the optimal secreted
polypeptide therapy.

For parenteral administration, in one embodiment, the secreted polypeptide is 
formulated generally by mixing it at the desired degree of purity, in a unit dosage 
injectable form (solution, suspension, or emulsion), with a pharmaceutically
acceptable carrier, i.e., one that is non-toxic to recipients at the dosages and
concentrations employed and is compatible with other ingredients of the formulation.
For example, the formulation preferably does not include oxidizing agents and other
compounds that are known to be deleterious to polypeptides.

Generally, the formulations are prepared by contacting the polypeptide 
uniformly and intimately with liquid carriers or finely divided solid carriers or both.
Then, if necessary, the product is shaped into the desired formulation. Preferably the
carrier is a parenteral carrier, more preferably a solution that is isotonic with the blood
of the recipient. Examples of such carrier vehicles include water, saline, Ringer's
solution, and dextrose solution. Non-aqueous vehicles such as fixed oils and ethyl
oleate are also useful herein, as well as liposomes.

The carrier suitably contains minor amounts of additives such as substances 
that enhance isotonicity and chemical stability. Such materials are non-toxic to 
recipients at the dosages and concentrations employed, and include buffers such as 
phosphate, citrate, succinate, acetic acid, and other organic acids or their salts; 
antioxidants such as ascorbic acid; low molecular weight (less than about ten
residues) polypeptides, e.g., polyarginine or tripeptides; proteins, such as serum
albumin, gelatin, or immunoglobulins; hydrophilic polymers such as
polyvinylpyrrolidone; amino acids, such as glycine, glutamic acid, aspartic acid, or
arginine; monosaccharides, disaccharides, and other carbohydrates including cellulose
or its derivatives, glucose, mannose, or dextrins; chelating agents such as EDTA; sugar
alcohols such as mannitol or sorbitol; counterions such as sodium; and/or nonionic
surfactants such as polysorbates, poloxamers, or PEG.

The secreted polypeptide is typically formulated in such vehicles at a
concentration of about 0.1 mg/ml to 100 mg/ml, preferably 1-10 mg/ml, at a pH of
about 3 to 8. It will be understood that the use of certain of the foregoing excipients,
carriers, or stabilizers will result in the formation of polypeptide salts.

Any polypeptide to be used for therapeutic administration can be sterile.
Sterility is readily accomplished by filtration through sterile filtration membranes
(e.g., 0.2 micron membranes). Therapeutic polypeptide compositions generally are 
placed into a container having a sterile access port, for example, an intravenous 
solution bag or vial having a stopper pierceable by a hypodermic injection needle. 

Polypeptides ordinarily will be stored in unit or multi-dose containers, for 
example, sealed ampoules or vials, as an aqueous solution or as a lyophilized 
formulation for reconstitution. As an example of a lyophilized formulation, 10-ml
vials are filled with 5 ml of sterile-filtered 1% (w/v) aqueous polypeptide solution, 
and the resulting mixture is lyophilized. The infusion solution is prepared by 
reconstituting the lyophilized polypeptide using bacteriostatic Water-for-Injection. 
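
The arithmetic behind the lyophilized-vial example can be sketched as follows. This is a minimal illustration only: the function names and the reconstitution volume are assumptions introduced here, and the calculation relies on the usual definition of 1% (w/v) as 1 g of solute per 100 ml of solution.

```python
def polypeptide_mass_mg(fill_volume_ml: float, percent_w_v: float) -> float:
    """Mass of solute (mg) contained in a volume of a % (w/v) solution.

    1% (w/v) means 1 g per 100 ml, i.e. 10 mg per ml.
    """
    if fill_volume_ml < 0 or percent_w_v < 0:
        raise ValueError("volume and concentration must be non-negative")
    mg_per_ml = percent_w_v * 10.0
    return fill_volume_ml * mg_per_ml


def reconstituted_mg_per_ml(mass_mg: float, diluent_ml: float) -> float:
    """Concentration after reconstitution.

    Assumes (hypothetically) that the lyophilized cake adds negligible volume.
    """
    if diluent_ml <= 0:
        raise ValueError("diluent volume must be positive")
    return mass_mg / diluent_ml


# A 10-ml vial filled with 5 ml of 1% (w/v) solution, as in the text.
vial_mass_mg = polypeptide_mass_mg(fill_volume_ml=5.0, percent_w_v=1.0)
# The 5 ml reconstitution volume is an illustrative choice, not from the text.
conc = reconstituted_mg_per_ml(vial_mass_mg, diluent_ml=5.0)
print(f"{vial_mass_mg} mg per vial, {conc} mg/ml after reconstitution")
```

Under these assumptions each vial holds 50 mg of polypeptide, and reconstituting in 5 ml gives 10 mg/ml, which falls within the concentration range stated above for formulated polypeptide.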

The invention also provides a pharmaceutical pack or kit comprising one or 
more containers filled with one or more of the ingredients of the pharmaceutical 
compositions of the invention. Associated with such container(s) can be a notice in
the form prescribed by a governmental agency regulating the manufacture, use or sale
of pharmaceuticals or biological products, which notice reflects approval by the 
agency of manufacture, use or sale for human administration. In addition, the 
polypeptides of the present invention may be employed in conjunction with other 
therapeutic compounds. 

Example 24: Method of Treating Decreased Levels of the Polypeptide 

It will be appreciated that conditions caused by a decrease in the standard or 
normal expression level of a secreted protein in an individual can be treated by 
administering the polypeptide of the present invention, preferably in the secreted 
form. Thus, the invention also provides a method of treatment of an individual in 
need of an increased level of the polypeptide comprising administering to such an 
individual a pharmaceutical composition comprising an amount of the polypeptide to
increase the activity level of the polypeptide in such an individual. 

For example, a patient with decreased levels of a polypeptide receives a daily
dose of 0.1-100 µg/kg of the polypeptide for six consecutive days. Preferably, the
polypeptide is in the secreted form. The exact details of the dosing scheme, based on 
administration and formulation, are provided in Example 23. 
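
As a rough illustration of the dosing scheme in this example, the following sketch converts the per-kilogram range into absolute amounts for a given body weight. The function names and the 70 kg body weight are assumptions chosen for illustration; only the 0.1-100 µg/kg range and the six-day duration come from the text.

```python
import math

DOSE_LOW_UG_PER_KG = 0.1     # lower bound of the daily dose, from the text
DOSE_HIGH_UG_PER_KG = 100.0  # upper bound of the daily dose, from the text
TREATMENT_DAYS = 6           # six consecutive days, from the text


def daily_dose_range_ug(weight_kg: float) -> tuple[float, float]:
    """Return the (minimum, maximum) daily dose in micrograms."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return (DOSE_LOW_UG_PER_KG * weight_kg, DOSE_HIGH_UG_PER_KG * weight_kg)


def course_total_range_ug(weight_kg: float) -> tuple[float, float]:
    """Return the (minimum, maximum) cumulative dose over the whole course."""
    low, high = daily_dose_range_ug(weight_kg)
    return (low * TREATMENT_DAYS, high * TREATMENT_DAYS)


# Hypothetical 70 kg patient.
low, high = daily_dose_range_ug(70.0)
print(f"daily dose: {low:.1f} to {high:.1f} ug")
```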

Example 25: Method of Treating Increased Levels of the Polypeptide 

Antisense technology is used to inhibit production of a polypeptide of the
present invention. This technology is one example of a method of decreasing levels 
of a polypeptide, preferably a secreted form, due to a variety of etiologies, such as 
cancer. 

For example, a patient diagnosed with abnormally increased levels of a
polypeptide is administered antisense polynucleotides intravenously at 0.5, 1.0, 1.5,
2.0 and 3.0 mg/kg/day for 21 days. This treatment is repeated after a 7-day rest
period if the treatment was well tolerated. The formulation of the antisense
polynucleotide is provided in Example 23.
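
The treatment calendar described above (21 days of dosing, a 7-day rest, then a repeat course if tolerated) can be laid out programmatically. This is only a sketch: the function name, the start date, and the choice to model a single repeat course are assumptions, and the text does not say how the listed dose levels are assigned to patients.

```python
from datetime import date, timedelta

# Dose levels listed in the text (mg/kg per day).
DOSE_LEVELS_MG_PER_KG = (0.5, 1.0, 1.5, 2.0, 3.0)


def treatment_days(start: date, tolerated: bool,
                   treat_days: int = 21, rest_days: int = 7) -> list[date]:
    """Return the calendar dates on which antisense is administered.

    The first course always runs for `treat_days`. If it was well tolerated,
    one further course begins after `rest_days` without treatment.
    """
    days = [start + timedelta(days=i) for i in range(treat_days)]
    if tolerated:
        second_start = start + timedelta(days=treat_days + rest_days)
        days += [second_start + timedelta(days=i) for i in range(treat_days)]
    return days


# Hypothetical start date, chosen only for illustration.
schedule = treatment_days(date(2000, 1, 3), tolerated=True)
print(schedule[0], schedule[20], schedule[21], schedule[-1])
```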

Example 26: Method of Treatment Using Gene Therapy

One method of gene therapy transplants fibroblasts, which are capable of
expressing a polypeptide, onto a patient. Generally, fibroblasts are obtained from a
subject by skin biopsy. The resulting tissue is placed in tissue-culture medium and
separated into small pieces. Small chunks of the tissue are placed on a wet surface of
a tissue culture flask; approximately ten pieces are placed in each flask. The flask is
turned upside down, closed tight and left at room temperature overnight. After 24
hours at room temperature, the flask is inverted and the chunks of tissue remain fixed
to the bottom of the flask and fresh media (e.g., Ham's F12 media, with 10% FBS,
penicillin and streptomycin) is added. The flasks are then incubated at 37°C for
approximately one week.

At this time, fresh media is added and subsequently changed every several 
days. After an additional two weeks in culture, a monolayer of fibroblasts emerges.
The monolayer is trypsinized and scaled into larger flasks. 

pMV-7 (Kirschmeier, P.T. et al., DNA, 7:219-25 (1988)), flanked by the long
terminal repeats of the Moloney murine sarcoma virus, is digested with EcoRI and
HindIII and subsequently treated with calf intestinal phosphatase. The linear vector is
fractionated on agarose gel and purified, using glass beads.

The cDNA encoding a polypeptide of the present invention can be amplified 
using PCR primers which correspond to the 5' and 3' end sequences respectively as set 
forth in Example 1. Preferably, the 5' primer contains an EcoRI site and the 3' primer
includes a HindIII site. Equal quantities of the Moloney murine sarcoma virus linear
backbone and the amplified EcoRI and HindIII fragment are added together, in the
presence of T4 DNA ligase. The resulting mixture is maintained under conditions
appropriate for ligation of the two fragments. The ligation mixture is then used to
transform bacteria HB101, which are then plated onto agar containing kanamycin for
the purpose of confirming that the vector has the gene of interest properly inserted.
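
The primer design described above, a 5' primer carrying an EcoRI site and a 3' primer carrying a HindIII site, might be sketched as follows. The template sequence, the 18-nucleotide annealing length, and the two-base 5' clamp are hypothetical choices made only for illustration; the recognition sequences GAATTC (EcoRI) and AAGCTT (HindIII) are the standard ones. The actual primers are those set forth in Example 1.

```python
ECORI_SITE = "GAATTC"    # EcoRI recognition sequence
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_internal_site(seq: str, sites=(ECORI_SITE, HINDIII_SITE)) -> bool:
    """True if the insert itself contains one of the cloning sites,
    in which case digestion would cut inside the cDNA."""
    return any(site in seq.upper() for site in sites)


def design_primers(cdna: str, anneal_len: int = 18, clamp: str = "GC"):
    """Build a forward primer (clamp + EcoRI + 5' end of the cDNA) and a
    reverse primer (clamp + HindIII + reverse complement of the 3' end)."""
    cdna = cdna.upper()
    if len(cdna) < anneal_len:
        raise ValueError("template shorter than annealing length")
    forward = clamp + ECORI_SITE + cdna[:anneal_len]
    reverse = clamp + HINDIII_SITE + reverse_complement(cdna[-anneal_len:])
    return forward, reverse


# Hypothetical 42-nt template; not a sequence from this document.
template = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCTAA"
fwd, rev = design_primers(template)
print("forward:", fwd)
print("reverse:", rev)
print("internal site present:", has_internal_site(template))
```

Checking the insert for internal EcoRI or HindIII sites before choosing the enzymes is a common precaution, since an internal site would cause the amplified fragment to be cut during digestion.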

The amphotropic pA317 or GP+am12 packaging cells are grown in tissue
culture to confluent density in Dulbecco's Modified Eagle's Medium (DMEM) with
10% calf serum (CS), penicillin and streptomycin. The MSV vector containing the
gene is then added to the media and the packaging cells transduced with the vector.
The packaging cells now produce infectious viral particles containing the gene (the
packaging cells are now referred to as producer cells).

Fresh media is added to the transduced producer cells, and subsequently, the
media is harvested from a 10 cm plate of confluent producer cells. The spent media,
containing the infectious viral particles, is filtered through a Millipore filter to remove
detached producer cells and this media is then used to infect fibroblast cells. Media is
removed from a sub-confluent plate of fibroblasts and quickly replaced with the
media from the producer cells. This media is removed and replaced with fresh media.
If the titer of virus is high, then virtually all fibroblasts will be infected and no
selection is required. If the titer is very low, then it is necessary to use a retroviral
vector that has a selectable marker, such as neo or his. Once the fibroblasts have been
efficiently infected, the fibroblasts are analyzed to determine whether protein is
produced.

The engineered fibroblasts are then transplanted onto the host, either alone or 
after having been grown to confluence on cytodex 3 microcarrier beads. 



Example 27: Method of Treatment Using Gene Therapy - In Vivo

Another aspect of the present invention is using in vivo gene therapy methods 
to treat disorders, diseases and conditions. The gene therapy method relates to the 
introduction of naked nucleic acid (DNA, RNA, and antisense DNA or RNA)
sequences into an animal to increase or decrease the expression of the polypeptide.
The polynucleotide of the present invention may be operatively linked to a promoter
or any other genetic elements necessary for the expression of the polypeptide by the
target tissue. Such gene therapy and delivery techniques and methods are known in
the art; see, for example, WO90/11092, WO98/11779; U.S. Patent Nos. 5693622,
5705151, 5580859; Tabata H. et al. (1997) Cardiovasc. Res. 35(3):470-479, Chao J. et
al. (1997) Pharmacol. Res. 35(6):517-522, Wolff J.A. (1997) Neuromuscul. Disord.
7(5):314-318, Schwartz B. et al. (1996) Gene Ther. 3(5):405-411, Tsurumi Y. et al.
(1996) Circulation 94(12):3281-3290 (incorporated herein by reference).

The polynucleotide constructs may be delivered by any method that delivers
injectable materials to the cells of an animal, such as injection into the interstitial
space of tissues (heart, muscle, skin, lung, liver, intestine and the like). The
polynucleotide constructs can be delivered in a pharmaceutically acceptable liquid or
aqueous carrier.

The term "naked" polynucleotide, DNA or RNA, refers to sequences that are 
free from any delivery vehicle that acts to assist, promote, or facilitate entry into the
cell, including viral sequences, viral particles, liposome formulations, lipofectin or
precipitating agents and the like. However, the polynucleotides of the present
invention may also be delivered in liposome formulations (such as those taught in
Felgner P.L. et al. (1995) Ann. NY Acad. Sci. 772:126-139 and Abdallah B. et al.
(1995) Biol. Cell 85(1):1-7) which can be prepared by methods well known to those
skilled in the art.

The polynucleotide vector constructs used in the gene therapy method are
preferably constructs that will not integrate into the host genome nor will they contain
sequences that allow for replication. Any strong promoter known to those skilled in
the art can be used for driving the expression of DNA. Unlike other gene therapy
techniques, one major advantage of introducing naked nucleic acid sequences into
target cells is the transitory nature of the polynucleotide synthesis in the cells. Studies 
have shown that non-replicating DNA sequences can be introduced into cells to 
provide production of the desired polypeptide for periods of up to six months. 

The polynucleotide construct can be delivered to the interstitial space of
tissues within an animal, including those of muscle, skin, brain, lung, liver, spleen, bone
marrow, thymus, heart, lymph, blood, bone, cartilage, pancreas, kidney, gall bladder,
stomach, intestine, testis, ovary, uterus, rectum, nervous system, eye, gland, and
connective tissue. Interstitial space of the tissues comprises the intercellular fluid,
mucopolysaccharide matrix among the reticular fibers of organ tissues, elastic fibers
in the walls of vessels or chambers, collagen fibers of fibrous tissues, or that same
matrix within connective tissue ensheathing muscle cells or in the lacunae of bone. It
is similarly the space occupied by the plasma of the circulation and the lymph fluid of
the lymphatic channels. Delivery to the interstitial space of muscle tissue is preferred
for the reasons discussed below. The constructs may be conveniently delivered by injection
into the tissues comprising these cells. They are preferably delivered to and
expressed in persistent, non-dividing cells which are differentiated, although delivery
and expression may be achieved in non-differentiated or less completely
differentiated cells, such as, for example, stem cells of blood or skin fibroblasts. In
vivo muscle cells are particularly competent in their ability to take up and express
polynucleotides.

For the naked polynucleotide injection, an effective dosage amount of DNA or
RNA will be in the range of from about 0.05 µg/kg body weight to about 50 mg/kg
body weight. Preferably the dosage will be from about 0.005 mg/kg to about 20
mg/kg and more preferably from about 0.05 mg/kg to about 5 mg/kg. Of course, as
the artisan of ordinary skill will appreciate, this dosage will vary according to the
tissue site of injection. The appropriate and effective dosage of nucleic acid sequence
can readily be determined by those of ordinary skill in the art and may depend on the
condition being treated and the route of administration. The preferred route of
administration is by the parenteral route of injection into the interstitial space of
tissues. However, other parenteral routes may also be used, such as inhalation of an
aerosol formulation particularly for delivery to lungs or bronchial tissues, throat or
mucous membranes of the nose. In addition, naked polynucleotide constructs can be
delivered to arteries during angioplasty by the catheter used in the procedure.
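
The nested "preferred" and "more preferred" dosage ranges stated above can be expressed as a simple classification. This is a sketch only: the function names and the label strings are assumptions, the bounds are treated as inclusive even though the text says "about", and the broadest range is deliberately left out.

```python
# Ranges in mg per kg body weight, taken from the text.
PREFERRED_MG_PER_KG = (0.005, 20.0)
MORE_PREFERRED_MG_PER_KG = (0.05, 5.0)


def classify_dose(mg_per_kg: float) -> str:
    """Report which of the stated ranges a candidate dose falls into.

    The narrower range is checked first because it lies inside the wider one.
    """
    if mg_per_kg <= 0:
        raise ValueError("dose must be positive")
    low, high = MORE_PREFERRED_MG_PER_KG
    if low <= mg_per_kg <= high:
        return "more preferred"
    low, high = PREFERRED_MG_PER_KG
    if low <= mg_per_kg <= high:
        return "preferred"
    return "outside preferred ranges"


def total_dose_mg(mg_per_kg: float, weight_kg: float) -> float:
    """Absolute amount of nucleic acid (mg) for a given body weight."""
    return mg_per_kg * weight_kg


for candidate in (0.1, 10.0, 30.0):
    print(candidate, "mg/kg ->", classify_dose(candidate))
```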

The dose response effects of injected polynucleotide in muscle in vivo are
determined as follows. Suitable template DNA for production of mRNA coding for
polypeptide of the present invention is prepared in accordance with a standard
recombinant DNA methodology. The template DNA, which may be either circular or
linear, is either used as naked DNA or complexed with liposomes. The quadriceps
muscles of mice are then injected with various amounts of the template DNA.

Five to six week old female and male Balb/C mice are anesthetized by 
intraperitoneal injection with 0.3 ml of 2.5% Avertin. A 1.5 cm incision is made on 
the anterior thigh, and the quadriceps muscle is directly visualized. The template 
DNA is injected in 0.1 ml of carrier in a 1 cc syringe through a 27 gauge needle over 
one minute, approximately 0.5 cm from the distal insertion site of the muscle into the 
knee and about 0.2 cm deep. A suture is placed over the injection site for future 
localization, and the skin is closed with stainless steel clips. 

After an appropriate incubation time (e.g., 7 days) muscle extracts are 
prepared by excising the entire quadriceps. Every fifth 15 µm cross-section of the
individual quadriceps muscles is histochemically stained for protein expression. A
time course for protein expression may be done in a similar fashion except that 
quadriceps from different mice are harvested at different times. Persistence of DNA 
in muscle following injection may be determined by Southern blot analysis after 
preparing total cellular DNA and HIRT supernatants from injected and control mice.
The results of the above experimentation in mice can be used to extrapolate proper
dosages and other treatment parameters in humans and other animals using naked 
DNA. 
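
The sampling scheme for histochemical staining (every fifth 15 µm cross-section) implies a fixed spacing between stained sections. The sketch below lists which sections would be stained and their depth into the muscle; the function name, the 1-based numbering, and the assumption that the first stained section is section 5 (rather than section 1) are illustrative choices not specified in the text.

```python
def stained_sections(n_sections: int, thickness_um: float = 15.0,
                     every: int = 5) -> list[tuple[int, float]]:
    """Return (section number, depth in micrometers) for each stained section.

    Sections are numbered from 1; the depth is the distance from the first
    cut to the start of the given section.
    """
    if n_sections < 0 or every <= 0 or thickness_um <= 0:
        raise ValueError("invalid sectioning parameters")
    return [(i, (i - 1) * thickness_um)
            for i in range(every, n_sections + 1, every)]


sections = stained_sections(20)
spacing_um = sections[1][1] - sections[0][1]
print(sections)
print("spacing between stained sections:", spacing_um, "um")
```

With these parameters the stained sections lie 75 µm apart, so a complete series samples the whole quadriceps while staining only one fifth of the sections.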

Example 28: Transgenic Animals.

The polypeptides of the invention can also be expressed in transgenic animals. 
Animals of any species, including, but not limited to, mice, rats, rabbits, hamsters,
guinea pigs, pigs, micro-pigs, goats, sheep, cows and non-human primates, e.g., 
baboons, monkeys, and chimpanzees may be used to generate transgenic animals. In a 
specific embodiment, techniques described herein or otherwise known in the art, are 
used to express polypeptides of the invention in humans, as part of a gene therapy 
protocol. 

Any technique known in the art may be used to introduce the transgene (i.e.,
polynucleotides of the invention) into animals to produce the founder lines of
transgenic animals. Such techniques include, but are not limited to, pronuclear
microinjection (Paterson et al., Appl. Microbiol. Biotechnol. 40:691-698 (1994);
Carver et al., Biotechnology (NY) 11:1263-1270 (1993); Wright et al., Biotechnology
(NY) 9:830-834 (1991); and Hoppe et al., U.S. Pat. No. 4,873,191 (1989)); retrovirus
mediated gene transfer into germ lines (Van der Putten et al., Proc. Natl. Acad. Sci.,
USA 82:6148-6152 (1985)), blastocysts or embryos; gene targeting in embryonic
stem cells (Thompson et al., Cell 56:313-321 (1989)); electroporation of cells or
embryos (Lo, Mol. Cell. Biol. 3:1803-1814 (1983)); introduction of the
polynucleotides of the invention using a gene gun (see, e.g., Ulmer et al., Science
259:1745 (1993)); introducing nucleic acid constructs into embryonic pluripotent
stem cells and transferring the stem cells back into the blastocyst; and
sperm-mediated gene transfer (Lavitrano et al., Cell 57:717-723 (1989)); etc. For a review of
such techniques, see Gordon, "Transgenic Animals," Intl. Rev. Cytol. 115:171-229
(1989), which is incorporated by reference herein in its entirety.

Any technique known in the art may be used to produce transgenic clones
containing polynucleotides of the invention, for example, nuclear transfer into
enucleated oocytes of nuclei from cultured embryonic, fetal, or adult cells induced to
quiescence (Campbell et al., Nature 380:64-66 (1996); Wilmut et al., Nature
385:810-813 (1997)).

The present invention provides for transgenic animals that carry the transgene
in all their cells, as well as animals which carry the transgene in some, but not all their
cells, i.e., mosaic or chimeric animals. The transgene may be integrated as a single
transgene or as multiple copies such as in concatamers, e.g., head-to-head tandems or
head-to-tail tandems. The transgene may also be selectively introduced into and
activated in a particular cell type by following, for example, the teaching of Lasko et
al. (Lasko et al., Proc. Natl. Acad. Sci. USA 89:6232-6236 (1992)). The regulatory
sequences required for such a cell-type specific activation will depend upon the
particular cell type of interest, and will be apparent to those of skill in the art. When
it is desired that the polynucleotide transgene be integrated into the chromosomal site
of the endogenous gene, gene targeting is preferred. Briefly, when such a technique is
to be utilized, vectors containing some nucleotide sequences homologous to the
endogenous gene are designed for the purpose of integrating, via homologous
recombination with chromosomal sequences, into and disrupting the function of the
nucleotide sequence of the endogenous gene. The transgene may also be selectively
introduced into a particular cell type, thus inactivating the endogenous gene in only
that cell type, by following, for example, the teaching of Gu et al. (Gu et al., Science
265:103-106 (1994)). The regulatory sequences required for such a cell-type specific
inactivation will depend upon the particular cell type of interest, and will be apparent
to those of skill in the art.

Once transgenic animals have been generated, the expression of the
recombinant gene may be assayed utilizing standard techniques. Initial screening
may be accomplished by Southern blot analysis or PCR techniques to analyze animal
tissues to verify that integration of the transgene has taken place. The level of mRNA
expression of the transgene in the tissues of the transgenic animals may also be
assessed using techniques which include, but are not limited to, Northern blot analysis
of tissue samples obtained from the animal, in situ hybridization analysis, and reverse
transcriptase-PCR (RT-PCR). Samples of transgenic gene-expressing tissue may also
be evaluated immunocytochemically or immunohistochemically using antibodies
specific for the transgene product.

Once the founder animals are produced, they may be bred, inbred, outbred, or
crossbred to produce colonies of the particular animal. Examples of such breeding
strategies include, but are not limited to: outbreeding of founder animals with more
than one integration site in order to establish separate lines; inbreeding of separate
lines in order to produce compound transgenics that express the transgene at higher
levels because of the effects of additive expression of each transgene; crossing of
heterozygous transgenic animals to produce animals homozygous for a given
integration site in order to both augment expression and eliminate the need for
screening of animals by DNA analysis; crossing of separate homozygous lines to
produce compound heterozygous or homozygous lines; and breeding to place the
transgene on a distinct background that is appropriate for an experimental model of
interest.
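
The strategy of crossing heterozygous transgenic animals to obtain homozygotes follows ordinary Mendelian segregation for a single integration site. The sketch below enumerates the offspring genotypes of such a cross; the allele symbols ("T" for the transgene-bearing chromosome, "+" for the wild-type chromosome) and the function name are assumptions introduced for illustration, and equal transmission of both alleles is assumed.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a: str, parent_b: str) -> dict[tuple[str, str], Fraction]:
    """Expected genotype frequencies among offspring of a single-locus cross.

    Each parent is given as a two-character genotype, e.g. "T+" for an
    animal heterozygous at one transgene integration site.
    """
    if len(parent_a) != 2 or len(parent_b) != 2:
        raise ValueError("each parent genotype must have exactly two alleles")
    counts = Counter(tuple(sorted(pair)) for pair in product(parent_a, parent_b))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Heterozygote x heterozygote cross at one integration site.
offspring = cross("T+", "T+")
for genotype, freq in sorted(offspring.items()):
    print("/".join(genotype), freq)
```

Under these assumptions one quarter of the offspring are expected to be homozygous for the integration site, which is why several litters may be needed to establish a homozygous line.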

Transgenic animals of the invention have uses which include, but are not 
limited to, animal model systems useful in elaborating the biological function of 
polypeptides of the present invention, studying conditions and/or disorders associated 
with aberrant expression, and in screening for compounds effective in ameliorating
such conditions and/or disorders.

Example 29: Knock-Out Animals. 

Endogenous gene expression can also be reduced by inactivating or "knocking
out" the gene and/or its promoter using targeted homologous recombination. (E.g.,
see Smithies et al., Nature 317:230-234 (1985); Thomas & Capecchi, Cell 51:503-512
(1987); Thompson et al., Cell 56:313-321 (1989); each of which is incorporated by
reference herein in its entirety). For example, a mutant, non-functional
polynucleotide of the invention (or a completely unrelated DNA sequence) flanked by
DNA homologous to the endogenous polynucleotide sequence (either the coding
regions or regulatory regions of the gene) can be used, with or without a selectable
marker and/or a negative selectable marker, to transfect cells that express
polypeptides of the invention in vivo. In another embodiment, techniques known in
the art are used to generate knockouts in cells that contain, but do not express the gene
of interest. Insertion of the DNA construct, via targeted homologous recombination,
results in inactivation of the targeted gene. Such approaches are particularly suited in
research and agricultural fields where modifications to embryonic stem cells can be
used to generate animal offspring with an inactive targeted gene (e.g., see Thomas &
Capecchi 1987 and Thompson 1989, supra). However, this approach can be routinely
adapted for use in humans provided the recombinant DNA constructs are directly
administered or targeted to the required site in vivo using appropriate viral vectors that
will be apparent to those of skill in the art.

In further embodiments of the invention, cells that are genetically engineered
to express the polypeptides of the invention, or alternatively, that are genetically
engineered not to express the polypeptides of the invention (e.g., knockouts) are
administered to a patient in vivo. Such cells may be obtained from the patient (i.e.,
animal, including human) or an MHC compatible donor and can include, but are not
limited to fibroblasts, bone marrow cells, blood cells (e.g., lymphocytes), adipocytes,
muscle cells, endothelial cells etc. The cells are genetically engineered in vitro using
recombinant DNA techniques to introduce the coding sequence of polypeptides of the
invention into the cells, or alternatively, to disrupt the coding sequence and/or
endogenous regulatory sequence associated with the polypeptides of the invention,
e.g., by transduction (using viral vectors, and preferably vectors that integrate the
transgene into the cell genome) or transfection procedures, including, but not limited
to, the use of plasmids, cosmids, YACs, naked DNA, electroporation, liposomes, etc.
The coding sequence of the polypeptides of the invention can be placed under the
control of a strong constitutive or inducible promoter or promoter/enhancer to achieve
expression, and preferably secretion, of the polypeptides of the invention. The
engineered cells which express and preferably secrete the polypeptides of the
invention can be introduced into the patient systemically, e.g., in the circulation, or
intraperitoneally.

Alternatively, the cells can be incorporated into a matrix and implanted in the
body, e.g., genetically engineered fibroblasts can be implanted as part of a skin graft;
genetically engineered endothelial cells can be implanted as part of a lymphatic or
vascular graft. (See, for example, Anderson et al. U.S. Patent No. 5,399,349; and
Mulligan & Wilson, U.S. Patent No. 5,460,959 each of which is incorporated by
reference herein in its entirety).

When the cells to be administered are non-autologous or non-MHC
compatible cells, they can be administered using well known techniques which
prevent the development of a host immune response against the introduced cells. For
example, the cells may be introduced in an encapsulated form which, while allowing
for an exchange of components with the immediate extracellular environment, does
not allow the introduced cells to be recognized by the host immune system.

Transgenic and "knock-out" animals of the invention have uses which include,
but are not limited to, animal model systems useful in elaborating the biological
function of polypeptides of the present invention, studying conditions and/or disorders
associated with aberrant expression, and in screening for compounds effective in
ameliorating such conditions and/or disorders.

It will be clear that the invention may be practiced otherwise than as
particularly described in the foregoing description and examples. Numerous
modifications and variations of the present invention are possible in light of the above
teachings and, therefore, are within the scope of the appended claims.

The entire disclosure of each document cited (including patents, patent
applications, journal articles, abstracts, laboratory manuals, books, or other
disclosures) in the Background of the Invention, Detailed Description, and Examples
is hereby incorporated herein by reference. Further, the hard copy of the sequence
listing submitted herewith and the corresponding computer readable form are both
incorporated herein by reference in their entireties.

What Is Claimed Is:

1. An isolated nucleic acid molecule comprising a polynucleotide having 
a nucleotide sequence at least 95% identical to a sequence selected from the group 
consisting of: 

(a) a polynucleotide fragment of SEQ ID NO:X or a polynucleotide fragment
of the cDNA sequence included in ATCC Deposit No:Z, which is hybridizable to
SEQ ID NO:X;

(b) a polynucleotide encoding a polypeptide fragment of SEQ ID NO:Y or a
polypeptide fragment encoded by the cDNA sequence included in ATCC Deposit
No:Z, which is hybridizable to SEQ ID NO:X;

(c) a polynucleotide encoding a polypeptide domain of SEQ ID NO:Y or a
polypeptide domain encoded by the cDNA sequence included in ATCC Deposit
No:Z, which is hybridizable to SEQ ID NO:X;

(d) a polynucleotide encoding a polypeptide epitope of SEQ ID NO:Y or a
polypeptide epitope encoded by the cDNA sequence included in ATCC Deposit
No:Z, which is hybridizable to SEQ ID NO:X;

(e) a polynucleotide encoding a polypeptide of SEQ ID NO:Y or the cDNA
sequence included in ATCC Deposit No:Z, which is hybridizable to SEQ ID NO:X,
having biological activity;

(f) a polynucleotide which is a variant of SEQ ID NO:X;

(g) a polynucleotide which is an allelic variant of SEQ ID NO:X;

(h) a polynucleotide which encodes a species homologue of the SEQ ID
NO:Y;

(i) a polynucleotide capable of hybridizing under stringent conditions to any 
one of the polynucleotides specified in (a)-(h), wherein said polynucleotide does not 
hybridize under stringent conditions to a nucleic acid molecule having a nucleotide 
sequence of only A residues or of only T residues. 

2. The isolated nucleic acid molecule of claim 1, wherein the
polynucleotide fragment comprises a nucleotide sequence encoding a secreted
protein.

3. The isolated nucleic acid molecule of claim 1, wherein the
polynucleotide fragment comprises a nucleotide sequence encoding the sequence
identified as SEQ ID NO:Y or the polypeptide encoded by the cDNA sequence
included in ATCC Deposit No:Z, which is hybridizable to SEQ ID NO:X.

4. The isolated nucleic acid molecule of claim 1, wherein the
polynucleotide fragment comprises the entire nucleotide sequence of SEQ ID NO:X
or the cDNA sequence included in ATCC Deposit No:Z, which is hybridizable to
SEQ ID NO:X.

5. The isolated nucleic acid molecule of claim 2, wherein the nucleotide 
sequence comprises sequential nucleotide deletions from either the C-terminus or the 
N-terminus. 

6. The isolated nucleic acid molecule of claim 3, wherein the nucleotide
sequence comprises sequential nucleotide deletions from either the C-terminus or the
N-terminus. 

7. A recombinant vector comprising the isolated nucleic acid molecule of 
claim 1. 

8. A method of making a recombinant host cell comprising the isolated 
nucleic acid molecule of claim 1. 

9. A recombinant host cell produced by the method of claim 8.

10. The recombinant host cell of claim 9 comprising vector sequences. 

11. An isolated polypeptide comprising an amino acid sequence at least
95% identical to a sequence selected from the group consisting of:

(a) a polypeptide fragment of SEQ ID NO:Y or the encoded sequence
included in ATCC Deposit No:Z;

(b) a polypeptide fragment of SEQ ID NO:Y or the encoded sequence
included in ATCC Deposit No:Z, having biological activity;

(c) a polypeptide domain of SEQ ID NO:Y or the encoded sequence included
in ATCC Deposit No:Z;

(d) a polypeptide epitope of SEQ ID NO:Y or the encoded sequence included
in ATCC Deposit No:Z;

(e) a secreted form of SEQ ID NO:Y or the encoded sequence included in
ATCC Deposit No:Z;

(f) a full length protein of SEQ ID NO:Y or the encoded sequence included in
ATCC Deposit No:Z;

(g) a variant of SEQ ID NO:Y;

(h) an allelic variant of SEQ ID NO:Y; or

(i) a species homologue of the SEQ ID NO:Y.

12. The isolated polypeptide of claim 11, wherein the secreted form or the
full length protein comprises sequential amino acid deletions from either the
C-terminus or the N-terminus.

13. An isolated antibody that binds specifically to the isolated polypeptide 
of claim 11. 

14. A recombinant host cell that expresses the isolated polypeptide of 
claim 11.

15. A method of making an isolated polypeptide comprising: 

(a) culturing the recombinant host cell of claim 14 under conditions such that 
said polypeptide is expressed; and

(b) recovering said polypeptide. 



16. The polypeptide produced by claim 15.

17. A method for preventing, treating, or ameliorating a medical condition,
comprising administering to a mammalian subject a therapeutically effective amount
of the polypeptide of claim 11 or the polynucleotide of claim 1.

18. A method of diagnosing a pathological condition or a susceptibility to 
a pathological condition in a subject comprising: 

(a) determining the presence or absence of a mutation in the polynucleotide of 
claim 1; and 

(b) diagnosing a pathological condition or a susceptibility to a pathological 
condition based on the presence or absence of said mutation. 

19. A method of diagnosing a pathological condition or a susceptibility to 
a pathological condition in a subject comprising: 

(a) determining the presence or amount of expression of the polypeptide of 
claim 11 in a biological sample; and

(b) diagnosing a pathological condition or a susceptibility to a pathological 
condition based on the presence or amount of expression of the polypeptide. 

20. A method for identifying a binding partner to the polypeptide of claim
11 comprising:

(a) contacting the polypeptide of claim 11 with a binding partner; and

(b) determining whether the binding partner effects an activity of the 
polypeptide. 

21. The gene corresponding to the cDNA sequence of SEQ ID NO:Y. 

22. A method of identifying an activity in a biological assay, wherein the 
method comprises: 

(a) expressing SEQ ID NO:X in a cell; 

(b) isolating the supernatant; 

(c) detecting an activity in a biological assay; and 

(d) identifying the protein in the supernatant having the activity. 
23. The product produced by the method of claim 20. 

<110> Human Genome Sciences, Inc. 

<120> 98 Human Secreted Proteins 

<130> PZ031.PCT 

<140> Unassigned 
<141> 1999-07-28 

<150> 60/094,657 
<151> 1998-07-30 

<150> 60/095,486 
<151> 1998-08-05 

<150> 60/095,455 
<151> 1998-08-06 

<150> 60/095,454 
<151> 1998-08-06 

<150> 60/096,319 
<151> 1998-08-12 

<160> 364 

<170> PatentIn Ver. 2.0 



<210> 1 

<211> 733 

<212> DNA 

<213> Homo sapiens 



<400> 1 

gggatccgga gcccaaa.ct cctgacaaaa ctcacacacg cccaccgccc ccagcaccrg 60 

aactcgaggg cgcaccgcca gzczzcczcz zccccccaaa acccaacgac acccccacga 120 

tcccccggac ccctgaggcc acacgcgugg cggcggacgc aagccacgaa gaccccgagg 180 

ccaagttcaa ccggtacgcg gacgccgcgg aggcgcaiaa cgccaagaca aagccgcggg 240 

aggagcagca caacagcacg caccgcgcgg ccagcgticct caccgccctg caccaggact 300 

ggctgaatgg caaggagtac aagcgcaagg cctccaacaa agccctccca acccccatcg 360 

agaaaaccat ccccaaagcc aaagggcagc cccgagaacc acaggtgtac accccgcccc 420 

catcccggga tgagccgacc aagaaccagg ccagcctgac ctgcccggtc aaaggcrcct 480 

atccaagcga caccgccgtg gagcgggaga gcaacgggca gccggagaac aactacaaga 540 

ccacgcctcc cgtgccggac rccgacggct cc^rcctcc- cracagcaag ctcaccgtgg 600 

acaagagcag gtggcagcag gggaacgtcc ccrcacgccc cgcgatgcac gaggctccgc 660 

acaaccacta cacgcagaag agcczczccc zgzczccggg Caaacgagrg cgacggccgc 720 

gactccagag gat 733 



<210> 2 

<211> 5 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> (3) 
<223> Xaa equals any of the twenty naturally occurring L-amino acids 
<400> 2 

Trp Ser Xaa Trp Ser 
1 5 
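
For illustration only, the five-residue motif of SEQ ID NO:2 (Trp Ser Xaa Trp Ser) can be searched for in a candidate amino acid sequence. The following is a minimal Python sketch and is not part of the sequence listing. The function names `motif_to_regex` and `find_motif` and the example sequences are hypothetical, and Xaa is assumed to match any of the twenty standard one-letter residue codes.

```python
import re

# One-letter codes for the twenty naturally occurring amino acids.
TWENTY = "ACDEFGHIKLMNPQRSTVWY"

# Three-letter to one-letter codes; only the residues named in SEQ ID NO:2.
THREE_TO_ONE = {"Trp": "W", "Ser": "S"}


def motif_to_regex(three_letter_motif: str) -> str:
    """Convert a motif such as 'Trp Ser Xaa Trp Ser' into a regex string."""
    parts = []
    for residue in three_letter_motif.split():
        if residue == "Xaa":
            # Xaa stands for any one of the twenty amino acids.
            parts.append(f"[{TWENTY}]")
        else:
            parts.append(THREE_TO_ONE[residue])
    return "".join(parts)


def find_motif(protein: str, motif_regex: str) -> list:
    """Return 1-based start positions of every match, overlaps included."""
    # A lookahead lets overlapping occurrences be reported separately.
    lookahead = re.compile(f"(?={motif_regex})")
    return [m.start() + 1 for m in lookahead.finditer(protein)]


if __name__ == "__main__":
    pattern = motif_to_regex("Trp Ser Xaa Trp Ser")
    # Hypothetical example sequence, not taken from the listing.
    print(find_motif("MKWSAWSLL", pattern))
```

Positions are reported 1-based so that they can be compared directly with the residue numbering used in the listing.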

<210> 3 
<211> 86 
<212> DNA 

<213> Homo sapiens 

<400> 3 

gcgcctcgag atttccccga aatctagatt tccccgaaac gacttccccg aaacgacttc 60 
cccgaaatat ctgccatctc aatcag 86 



<210> 4 

<211> 27 

<212> DNA 

<213> Homo sapiens 

<400> 4 

gcggcaagct ttttgcaaag cctaggc 27 



<210> 5 
<211> 271 

<212> DNA 

<213> Homo sapiens 



<400> 5 

ctcgagactt ccccgaaatc tagatctccc cgaaacgatt tccccgaaat gatttccccg 60 

aaatatctgc catctcaatt agtcagcaac catagtcccg cccctaactc cgcccatccc 120 

gcccctaact ccgcccagtc ccgcccattc tccgcccccir ggctgactaa ttttttctat 180 

ttatgcagag gccgaggccg cctcggcctc tgagctattc cagaagtagt gaggaggctt 240 

tttcggaggc ctaggctttt gcaaaaagct t 271 



<210> 6 

<211> 32 

<212> DNA 

<213> Homo sapiens 

<400> 6 

gcgctcgagg gatgacagcg atagaacccc gg 32 



<210> 7 
<211> 31 
<212> DNA 

<213> Homo sapiens 

<400> 7 

gcgaagcttc gcgactcccc ggatccgcct c 31 



<210> 8 
<211> 12 
<212> DNA 

<213> Homo sapiens 

<400> 8 
ggggactttc cc 12 



<210> 9 
<211> 73 
<212> DNA 

<213> Homo sapiens 

<400> 9 

gcggccccga ggggacttcc ccggggacct eccggggact ccccgggacc ttccaccctg 60 
ccatctcaac cag 73 

<210> 10 
<211> 256 
<212> DNA 

<213> Homo sapiens 
<400> 10 

ctcgagggga ccttcccggg gacctcccgg ggaccttccg ggaccttcca cccgccacct 60 
caattagtca gcaaccatag tcccgcccct: aaczccgccc accccgcccc caactccgcc 120 
cagttccgcc catcctccgc cccatggccg actaacctcr cccacccatg cagaggccga 180 
ggccgcctcg gcccctgagc tattccagaa gtagtgagga ggcttctttg gaggcctagg 240 
ctttcgcaaa aagctt 256 



<210> 11 

<211> 1564 

<212> DNA 

<213> Homo sapiens 



<400> 11 



gcggacggcg ggccgcgcaa ccttcctccc t ttcttaaac gcr:cggggca tttgtctggc 60 
cttcccttcc accgccgcct gccgcccgca tctgccrcct: aacczzcazc aaccgcgcct 120 
acgtcaaacg gggaaccccg gcacaagana ctttcaccia ::gcLaaagt:a i-ggcaccga 180 
tcgcggccac cgrtgcaggc atitgttagac ccggccaggg agcczzzacz caztctgaga 240 
attcccccga gggtccacca ctcgcagcgg gtgacaccgc cctggcacrg cactcagccc 300 
tgtcctccta Cwcaggczgg gacaccccca actacgccac cgaagagarc aagaatcctg 360 
agaggaacct gccccccccc accggcacct ccacgcccac tgccaccarc arccatacct: 420 
tgaccaatgt ggcctattat accgtgcrag acacgagaga cater cggcc agtgatgccg 480 
ttgctgtgac tctitgcagat: cagatattcg gaacacttaa ccggacaatc ccaccgtcag 540 
tcgcatcatc ccgttctggc ggcctcaatg cctccaccgt ggcrgcCuCt aggctttccL 600 
ttgcgggctc aagagaaggc catccccccg acgccatccg cacgatccac gr::gagcgg- 660 
tcacaccagc gccttctctig ccctccaarg gtaccatggc accgacctac rrgcgcgtgg 720 
aagacaccuC ccagcwcatic aaccaccaca gctccagcta ccggutcttt: gcggggctcc 780 
ctatcgtggg tcagctttac ctgcgccgga aggagcccga zcgacczcgz ccccccaagc 840 
Ccagcgtttt cttcccga-t gtcttctgcc cctgcaccac czzcczggzg gcrgtcccac 900 
ttcacagcga cactatcaac ccccccatcg gcatrgccac tgccctctca ggcctgcccc 960 
ttcactcccc catcatcaga gtgccagaac acaagcgacc gcttcacctc cgaagatcgt 1020 
ggggcccgcc acaaggtacc tccaggccct gtgcacgcca gctgctgcag aaatggacc" 1080 
ggaagatgga ggagagacgc ccaagcaacg ggaccccaag tctaactaaa caccacctgg 1140 
aatcctgatg cggaaagcag gggtttctgg tctaccggct agagccaagg aagctgaaaa 1200 
ggaaagctca cctccccgga ggcacccgtic cagaagcccg gcccaggcag ccrcaaccwC 1260 
cgaacctact ccttgaaacg aaaagcaatt catctgttct gctacacact gccccagact 1320 
tttaaagggg acaatgaagg cgactgtiggg gaggagcacg ccaggnctgg gcctggttgt 1380 
cttagaagca cctgggtgcg cccacctact: cctccctcct tctaaaaggg cccacaatgc 1440 
tccaattccc tgrctccttt agagagacat gaaactacca caggcgctgg acgacaataa 1500 
aagtctatgt tcctaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaagggcggc 1560 
cccg 1564 



<210> 12 

<211> 1757 

<212> DNA 

<213> Homo sapiens 



<400> 12 

ggcacgaggc agatgggttc ttggtgtgga cgtccnttct gttcgttagt tgcccttcca 60 

acagacagga ccctcagctg caggcctgtt ggagtaccct: gcaacgtgag gtgtcagtgt 120 

gcccctgctg gagggtgccc cccagctagg crgctcgcgg gccaggggcc agggacccac 180 

ttgaggaggc agtctgcccg tcctcagatc tccagccgca cgctgggaga accaccgctc 240 

tcttcaaagc cgtcagacag ggacatttaa gtctgcagag gttactgctg tcctcttgtt 300 

tgtctgtgcc crgcccccag aggtggagcc tacagaggca ggcaggcctc cttgagctgt 360 
ggtgggctcc acccagttcg agcttcccgg ctgctctgct tacctaagca agcctgggca 420 

atggcgggcg cccctccccc agtcccgctg ccgccctgca gtttgatctic agactgctgt 480 

gctagcaatc agcgagactc cgcggggcag gaccctccga gccaggtggg ggacacaatc 540 

ctgtggtgcg ccactcctita agcccgtcgg aaaagcgcag cattcaggcg ggagtgaccc 600 

gattctcccg acrttccagg tgccgtccgt cacccctrtc ttcgartagg aaagggaact 660 

ccctgacccc tcgcgccccc cgagtgagyc aacgccccgc gctgctttgg ctcgtgcatg 720 

gtgcgtgcac ccactgacct gcgcccactg tctggcactc cctagtgaga tgaacccggt 780 

acctcagacg gaaatgcaga aatcacccgt cccccgcgcc gcccaggctg ggagctgtag 840 

accagagctg ttcctattcc gccaccccgg cccctccccc accccccgcc caaccctaag 900 

tttatctctt aacaaagtaa acactgcaca cgtacttaca cgttttaaaa tattgtttat 960 

ggagcactta ctacgcgttt cacggcccaa gcactccaca ttattcactg actctttata 1020 

gcaactctat aacatattaa tagaccgtct caactccacc gatgacaaaa ccggcagcct 1080 

agtaactaat gagtaacaac accagcccct aaaacctagg gtcctcccac tacagcacac 1140 

tctgactgac attgcccttg ttcagacaca gcactcttca cctagaaaag gacggagagc 1200 

tagaaagtgg ccctcttttt caagtcgttc ccatiucacca atattctarc aagatatatg 1260 

gaagagttat tttctaggtc ttccagaagg ggctgccgac acttgggaat tcaaaataat 1320 

taccccagag ggaaatgtgg gcacagaaac cttttaaaaa ccgaactttc ccgatctaat 1380 

atcggcttcc acacacagca cattacttct caaaaatcag ccagaaatgc acctgcagtt 1440 

ttatctactg gactttcctc tcccagcccc attcctccct aagtagcctt ttittccccat 1500 

tctataaaag aaaagcatac cccccaccta tagczzcccz cctraiggca cactatctca 1560 

ggggataatc rgaaatactg cccgagacac aacccccac:: aagaaataaa ctittcaggcc 1620 

gggtgcagtg gctcacacCt gcaaccccag aacctcggga ggccaaggcg ggggaggatc 1680 

cctgaaggcc aagagcccaa gaccagcccg gaacacarac caaga^cccc cctccacaaa 1740 

aaaaaaaaaa aaaaaaa 1757 



<210> 13 

<211> 1373 

<212> DNA 

<213> Homo sapiens 



<400> 13 

gtgaaatgaa acatgtacac caaaacacat aaacncaagc ttcataattt rctttccttg 60 

tggatcigga aaattctctc cccgctigttc ttcactctga cagttgcttt ggcnctccct 120 

attccctgcc mtccatatt ctgaaatgtt ctiggtcaaat: cttttcctgc cgagtgtgca 180 

gtatgctttg accgttaatt: tcactgacca ccaccccagt ggattatrca tictgaaactt 240 

ttctgtttcc cgctccggaa cgLCCCctgc tccggaagcc acatcticaac cgcataccga 300 

tccagcttta tttatcccaa gcactcttac aactatttta ctgataatat acaactaggt 360 

ataatcactc tagtcagtga aaccaaactg ataacacaaa acgacgctgc tagtaaacct 420 

agcctgaaaa ggaccagagg aagcaccttt tagtctaagc agtcaagacc aacaaactta 480 

gaacgagagc ctcgtattat tcagccttgc gacagaagga aaaagtacag ctatcgaatc 540 

tgtataccaa rctatgcaaa acacacacat catctcccca ttaacacagt agtatgacat 600 
acagttttac caccaccgga ttcatgcatg cctaaaatga aaacgaaagg aacacacagg 660 

aaagcagctc cttatcttcc cagatacagc ctccaagagt gcccctgctg tgcaccggag 720 

tgagtttgaa gaagttcagg taatccrctc ctaaccagca gcttaactta aagtgcatcc 780 

tcattctcgt cctaccacta aaagctrgct tattgacctc ctccnttcct gcctcctaaa 840 

trgtgactgg ctctcagcca gacctgcatc tgttgctatc tgcagcagca actgcaggat 900 

tcatctgctt ttcctcttigc agtgtatttc tgaatgcttt acaactctgg gtacgcacaa 960 

cgcacgtaca tgcatatcca tacggccaag agttttcaga actcnccgtc cgaataatga 1020 

acgtgtgtat tccctacccc tcgtcctgct ctgtcctaac atttaagctc tgcttatcaa 1080 

aacgtacttg ttctaatgcc ttctctttag attgtaagct cttcatgggc agtggtcaga 1140 

tattctgtat ttgtaactcc ccgagaatct: agtattttac cacgtacata atgaacatag 1200 

aatagtattg attggaagga cagcgaagca caaaatgaga ttcraaaatg acattttcat 1260 

tagggctcac aagtgtataa gaagtacccc ttccctattt ttttgtttaa cactttagaa 1320 

gctattacat taaaaaagct atccccatca aaaaaaaaaa aaaaaaaaaa aaa 1373 



<210> 14 

<211> 3740 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> (957) 

<223> n equals a,t,g, or c 



<400> 14 

actctctaag gcttttttgg actcaccaaa caggcttctt gcagtggaga tgaatacaga 60 
tcacttaagg ttgactgtgc caaatggcat aggggccctg aagctaaggg aaatggaaca 120 

ctacttccca cagggcctgt cagttcagct gtttaatgac gggtccaagg gcaaactcaa 180 
tcatttatgc ggagctgact ttgcgaaaag tcatcagaaa cctccacagg gaatggaaat 240 
taagtccaat gaaagatgcc gttccct;:ga cggagacgca gacagaactg cttatcacta 300 

ccatgatgca gatggccact ttcatcccat agatggagac aagatagcaa cgttaattag 360 
cagtttcctt aaagagcccc tggcggagac tggagaaagt wCgaataccg gtgttgcaca 420 
aactgcatat gcaaatggaa gtccaacacg gtatcttgaa gaagtratga aggtacctgt 480 
ctattgcact aagactggtg caaaacatrt gcaccacaag gctcaagagt ttgacactgg 540 
agttcatttc gaagcaaatg ggcacggcac tgcactgttt agtacagctg ttgaaatgaa 600 
gataaaacaa tcagcagaac aaciggaaga caagaaaaga aaagc^igcta agacgcctga 660 

aaacattatt gacttgcLca accaggcagc cggcgatgct att-tctgaca cgctggtigat 720 

tgaagcaacc ttggctccga agggcctgac cgcacaacag tgggatgctic tcratacaga 780 
tcttccaaac agacaaczca aagctcaggt cgcagacacg agagtitatta gcactaccra 840 

tgctgaaaga caagcagtta cacccccagg actacaggag gcaaccaat:g acctggtgaa 900 

gaagtacaag cnctctcgag ctt"tgcccg gccctctggt acagaagatg tcgcccngag 960 

tatacgcaga agcagactca caagaaagcg cagaccacct tgcacatgaa gtgagcttgg 1020 

cagnatttca gcnggctgga ggaatcggag aaaggcccca accaggtttc tgaagataat 1080 

tttcatattc ctgagaaacc ggaccctrtia caagccttca caaaactgtc aacaacaatg 1140 

gcagcactaa gagacctaca atcataacgt t::acaatgca gccracngga ttgcctctag 1200 

atctgttttt cttaaacact aacagaacaa ctctttataa atiaggnaagc cttacacctg 1260 

ttaaagaaac ttacctctaa tttcagtczc acraatgtaa aacactggga ctcaagcaca 1320 

caacccagcc actaactgra cagccttatg tggggaacaa tccatgcagg ctaccggaaa 1380 

attaaatctc atcaccaact ccccgtgaca rctttgccat caccaccaca tgagcaagat 1440 

gatgttttgc agcattcccc attgcngaca caaatggaga gggcagagaa gaccttatac 1500 

aaccagtctt cccactgcag agtcttaaga aagattacca gacgacctac ctatacgact 1560 

aatgccacca ggaactcaga ggtacgaata gggggztgtc catccctccc ccatactgag 1620 

gtggagatgc ccatgcaata cctctaagga cgcacggtcc agcccccagt tacccttcac 1680 

tgctctcggt gaaggtatgt gggagaaaaa ccaattataa tacgtttccc agcctctgat 1740 

ggagaaggaa caccattcrg ataccagaac atggttaaca aggaaaagag aaaaatcccc 1800 

aaccaatctt aattgaacca agcccgaaac caacggaaaa aaaaaatggg tagcgtatat 1860 

tctgcaggct caagacaacc caggacaata aaaacaatgg accttacatg tgtatacata 1920 

ragctccctt aggcaccaca atcagcacga gccaacaata ccnaaacctg a^tcaggcca 1980 
cattcagaca tctgctctca cacacaaaca cttaaatnaa atacaatctg aaatgtgttc 2040 

tgttacatac aaaaaaggaa aaactataca acgcagagca gtgtgtgtgt tttaaacaat 2100 

tacatttaca tgtaagctaa acggaaccag caacggtgcc caagctttta tcatccctcc 2160 

cagaaaatcc tttcccacca tctcttctac tttctgcccg gctttgctgg aacacggctt 2220 

gtggttctcc agtttcatgt ccccatragg gaaggcatct gagT:agagga taggaccccc 2280 

tgagtgtcct ccacatcggc ttgtgacttt gccgtcgaag acttgaccga gcacatcgaa 2340 

gaacggcagg agctgctcca tactgcgcac ggtgcagacg gtgagcagca agcgccctgg 2400 

ctcccaaccc aacgttctcc ccgagtcgcc tccccctgga tttttctgca gaaaacaaaa 2460 

agtgaactgg taccaataca acagacaacg cggtatgtca gaaaaattaa aaatacataa 2520 

actttggcaa tcggtcaaga aacgaacaca aatgacacta agtttctaac tcctgacccg 2580 

atcaaaaccc ttggtgcctc cgagaccctt tactgccacc cattagtttc acacggagca 2640 

gtctaacatt gtagtaacag ttcccaacta gaacgcgcag ataagcttag ttaacagaaa 2700 

tagctttgaa caggaacaga gtcaaacata aaagttttat gttgtgcttt gtattcactc 2760 

aaaaagctcc caggtttctig aaccctcacc actgtaacca aggactaggt cacaaaatta 2820 

ctacagaaaa aaggaacaaa gtgcrtcata catcccataa catatcccct tttattataa 2880 

ttagtuaatt cccttttatc taaacggcct aaatttgcca tgatggtagc agtgtccaaa 2940 

gtgaataatt actgtcagca ctgcaccaca gagaaaggaa gggatccctc aggagacact 3000 

gctgtctcct tctgggttgt gccaaacaac atagggagga aagctggacc cggagtcaaa 3060 

ggaattgagt cagtgtgccg gctictgccat acttacggca cccttgggca ggatatacaa 3120 

aggttcctca ctcataaaat gggacagtcc aaaactacct tttagtagag aagtcaaatg 3180 

agaaggtatg tgaaaactct gtcaaccaaa tataaagact aacaacctgg gtattaagag 3240 

gccagttcga gaagccaccc gaactacaca aacacagcna cagacatcat tcngtccaga 3300 

gaaagacaag agagaacagg ttggctgaac ctgggcagaa ncacagatac aattccacac 3360 

taaagaacga aaacaagcaa tgaaccagac agaaggaaga aaccatgaag acttaggaag 3420 

cagaattaca atctgtcata ttaacaaatg gagtttgcct: cctaagatca gacgttgctc 3480 

agaaactttc actgtttacc taacaattita ataccactag crtcctagtg ggtcaagcag 3540 

atgcaaaatc cagcttattt: tcttccacgt gctctcaagc ctactgctta ttttaaagca 3600 

aaatcctgaa aaaggaaaat actaggttgg cgcaaacgna artgcggctt ttgcattgtt 3660 

gaaatttgcc gttttatatt ggagcacatt cctaaacaaa cgcggtcatg ctatacaaaa 3720 

aaaaaaaaaa aaaactcgag 3740 



<210> 15 
<211> 1196 
<212> DNA 

<213> Homo sapiens 



<400> 15 

ccacgcgtcc ggcgaaccrg gatgatccat ctatgtca^c tarctacgtg "cccttctica 60 

tgccagcacc ccaccgcccc gatcactgcg gcrttatggc gtcztigaaat caagtattgt 120 

taagctttgc aatctctccc aagattgtct ::ggccatagg wccctitcttt ataaagtcta 180 

tattaagctt ctccatttrt ctcraagaaa aaatctgccg gaatggcact gaartcctag 240 

gtcgatttgg gaagaactga caacatr.gaa cccttcaacc garggacatg gcatgnttct 300 

gcatttattt gggccttcta aaattcatcc cagcaatact ttgtagtttt agtgaagaag 360 

tctcgtatat atctttcgtc aaacgcatcc ctiaaataccc cgtggaaacg Cwaccgtaaa 420 

tagcactcta atctcac-tt taagtittatic gccggcacac agaaacatgc- cttatttrtg 480 

taatactgac ttctatat'c tgtaacctta cccaaactca ctcagcccag tag!:agttct 540 

tatttctatc ttatcgcaag cgcctraggg atacctcgcg cacacaatcc acgtcacata 600 

cgaataacca acaggtttac cttLtgttzc ttcrtgcCLt gccgtaatgg ctaggacctc 660 

ccaacctaaa cggtcgaaca gaagcgg-gc gtgagctcac "acccttgcc ctactctcaa 720 

cagcagagga tagcgtccag tctrtcacta ciaaaacacg ttagccatag gtttttitccg 780 

ttccatgctt ccatcagatt gaagarccct tttatrccta t tntigccaag agcgccacga 840 

atgcatgtiig aatttcacca agcacttc't cccgcatccg ctgagagaat ctcgactctt 900 

ctctcctatt titgccaangc agactcctta acgttaaacc gaccttgcat ccctgatata 960 

aaccatactt agtcatgata cgttaccccc ctcacacaac gctggatticg grttgcaaat 1020 

attctgtctg atttttgcat ctatgctcac gagacagact ggtccgtgac tttcctttct 1080 

tgtaagcctt gccaggtgtc agggcctggt caattttgac acattcacaa acatatacrt 1140 

ttaagacaaa aacaaacttc aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaa 1196 
<210> 16 
<211> 2209 
<212> DNA 

<213> Homo sapiens 



<400> 16 

gagctgcgcc gggcccgggc gcctggggcc gccgctcccc accgtcgtcc cccccaccga 60 

ggccgaggcg tcccggagcc acggccggcc cgaaccgcgg ggcctccatc gcactgccag 120 

gggttctgct gctgggtgcg gcgcgcccgc cgcgcggggc agaagccttt gagattgctc 180 

tgccacgaga aagcaacatt acagttccca taaagctggg gaccccgact ctgctggcaa 240 

aaccctgtca catcgtcatt cccaaaagac atataaccat gccgtccacc aagcctggag 300 

aaagaatagt ccttaccLtt agccgccaga gccctgagaa ccactttgtc atagagaccc 360 

agaaaaacat cgactgtacg tcaggcccat gtccctttgg ggaggttcag cttcagccct 420 

cgacatcgtt gttgcctacc ctcaacagaa ccctcatctg ggacgtcaaa gcccacaaga 480 

gcatcggttt agagccgcag ttttccatcc cccgcccgag gcagaccggt ccgggtgaga 540 

gctgcccaga cggagccact cactccatca gcggccgaat cgatgccacc gtggtcagga 600 

tcggaacctt ccgcagcaac ggcaccgcgc cccggaccaa gacgcawgaa ggagtgaaaa 660 

tggccttaca ccLcccacgg ttccacccca gaaatgtctc cggcrticagc actgcaaacc 720 

gctcatccac aaaacgcctg tgcatcaccg agtccgtgcr cgagggcgaa ggctcagcaa 780 

ccctgatgtc cgccaaccac ccagaaggct cccccgagga cgagcccatg acgtggcagc 840 

ccgtcgtLcc cgcacacccg cgggccagcg cctccctccc caactccaac ctictccaact 900 

gtragaggaa ggaggagcgg gccgaacact acatcccggg ctccaccacc aaccccgagg 960 

tgtccaagct ggaggacaag cagcccggga acacggcggg gaactccaac ctctctctgc 1020 

aaggctgcga ccaagacgcc caaagtccag ggaccctccg gczgcagzzc caagtttcgg 1080 

tccaacatcc acaaaatgaa agcaataaaa tccacgtggt igacctgagc aacgagcgag 1140 

ccatgtcact caccatcgag ccacggcccg tcaaacagag ccgcaagttc gcccccggct 1200 

gtttcgtgtg tctagaatct cggacctgca gtagcaacct: caccctgaca tccggctcca 1260 

aacacaaaat ctccttccct tgcgacgatc tgacacgtct gtggacgaac grggaaaaam 1320 

ccataagytg cacagaccac cggtactgcc aaaggaaatc ctactcacty caggcgccca 1380 

gtgacatccc ycamctgcct gtggagccgc acgacctctc ccggaagccg ccggtgccca 1440 

aggacaggct cagcccggtg ccggtgccag cccagaagct gcagcagcat acacacgaga 1500 

agccctgcaa caccagcttc agctaccccg ^ggccagcgc cacacccagc caggacctgt 1560 

acttcggctc cttctgcccg. ggaggCwCta "caagcagac ccaggcgaag cagaacaccc 1620 

cggtgaccct tcgcacccct gcccccagct tccgacaaga ggccrccagg cagggtccga 1680 

cggtgccctt tacaccttat ttcaaagagg aaggcgctr.r cacggtgacc cczgacacaa 1740 

aaagcaaggt ctacctgagg acccccaacc gggaccgggg cccgccaccc ctcacctctg 1800 

cgtcctggaa caccagcgtg cccagagacc aggtggcccg cczgaczzzc t:r;T:aaggagc 1860 

ggagcggcgc ggrctgccag acagggcgcg carccatgat cacccaggag cagcggaccc 1920 

gggctgagga gatcttcagc ccggacgagg acgcgctccc caagccaagc crccaccatc 1980 

acagcttctg ggtcaacatc tycaaycgma gccccacgag cggcaagcag ctagacctgc 2040 

tcttctcggt gacacctacc ccaaggactg zggactcgac cgtcaccccc atcgcagcgg 2100 

cgggaggtgg agcctcaccg czgzczgccc rcgggcrcat catccgccgc gcgaaaaaaa 2160 

aaaaaarama aacaaggggc cccgczgzgg gcatctacaa cggcaacac 2209 



<210> 17 
<211> 1774 
<212> DNA 

<213> Homo sapiens 



<400> 17 

caggcacccc aggctcttac accccawgct: ccggcccgca cgccgcgtga actgtaaccg 60 

gacaccaacc tcacacaagg aaccagccca cgaccacgac racgccaagc ncgaaaccaa 120 

cccccactaa agggaacaaa agctggagct ccaccgcggt ggcggccgcr ccagaactag 180 

tggacccccc gggctgcagg aatccggcac gagcggctgc gggcgcgagg cgaggggcgc 240 

gaggctccca gcaggacgcc ccggccctgc aggaagccga agcgagaggc ccggagaggg 300 

cccagcccgc ccggggcagg atgaccaagg cccggctgct ccggctgtgg ctggtgccgg 360 

ggtcggtgtt: cacgaccccg ctgaccaccg cgcacrggga cagcgcaggc gccgcgcact 420 
tctacttgca cacgcccttc tctaggccgc acacggggcc gccgcngccc acgcccgggc 480 

cggacaggga cagggagccc acggccgacc ccgatgtcga cgagcctctg gacaagcttc 540 

tcagtgctgg cgtgaagcag agtgaccctc ccagaaagga gacggagcag ccgcctgcgc 600 

cggggagcac ggaggagaac gcgagaggcc acgaccggtc cccgcgcgac gcccggcgca 660 

gcccagacca gggccggcag caggcggagc ggaggagcgt gccgcggggc ttccgcgcca 720 

accccagccc ggcctccccc accaaggagc gcgcattcga cgacaccccc aactcggagc 780 

tgagccacct gatcgtggac gaccggcacg gggccatcta ctgctacgcg cccaaggtgg 840 

cctgcaccaa ccggaagcgc gtgacgatcg tgctgagcgg aagcccgctg caccgcggtg 900 

cgccctaccg cgacccgccg cgcatcccgc gcgagcacgt gcacaacgcc agcgcgcacc 960 

tgaccctcaa caagctctgg cgccgctacg ggaagctctc ccgccacccc atgaaggtca 1020 

agctcaagaa gtacaccaag ttccccttcg tgcgcgaccc cttcgtgcgc ctgatctccg 1080 

ccttccgcag caagtccgag ctggagaacg aggagttcca ccgcaagtcc gccgtgccca 1140 

tgctgcggct gcacgccaac cacaccagcc tgcccgcctc ggcgcgcgag gccttccgcg 1200 

ctggcctcaa ggtgtccttc gccaacttca tccagcaccc gctggacccg cacacggaga 1260 

agctggcgcc cttcaacgag cactggcggc aggtgcaccg cctctgccac ccgtgccaga 1320 

tcgactacga cttcgtgggg aagctggaga ctctggacga ggacgccgcg cagctgctgc 1380 

agctactcca ggtggaccgg cagctccgct tccccccgag ctaccggaac aggaccgcca 1440 

gcagctggga ggaggactgg ttcgccaaga tccccctggc ctggaggcag cagctgtata 1500 

aactccacga ggccgacttt gttctctccg gctaccccaa gcccgaaaac ctcctccgag 1560 

actgaaagcc ctcgcgttgc tctLtctcgc gtgcccggaa cctgacgcac gcgcactcca 1620 

gtttttctac gacctacgac cttgcaatcc gggcttcttg tccactccac cgcctctacc 1680 

catcgagtac tgtatcgaca ccgctttcca agactaacat acctcaggna tccaatacga 1740 

aaaaaaaaaa aaaaaaaccc gaggggggyc- ccgg 1774 



<210> 18 

<211> 1674 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> (1649) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (1663) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (1665) 

<223> n equals a,t,g, or c 



<400> 18 

caagttggta cgcctgcagg caccggcccg gaattcccgg gccgacccac gcgtccggtc 60 

gaagataggc tcgagacggc ggatgttgca gctgatcacg cagrcgggcc cggcgctgct 120 

cacacgccgc cccttttggg gccgcntcag ccagcrcacg ctgtiacgctg agagggctga 180 

ggcacgccgg aagcccgaca tcccagcgcc c::acctgtat tccgacaigg gggcagccgt 240 

gctgtgcgct agcttcatgc ccztzqgcgz gaagcggcgc cggtrcgcgc cgggggccgc 300 

actccaattg gccattagca cccacgccgc ctacaccggg ggctacgtcc accacgggga 360 

ccggctgaag gcccgcatgc acccgcgcac agttgccatc atcggcggct ttcctgtgtt 420 

ggccagcggt gcnggggagc tgtaccgccg gaaaccccgc agccgccccc tgcagtccac 480 

cggccaggtg ctcctgggta tctaccccat ctgtgcggcc tactcactgc agcacagcaa 540 

ggaggaccgg ctggcgcacc cgaaccatct cccaggaggg gagccgatga cccagctgtc 600 

cctcgtgctg tacggcaccc cggccctggc ctctctgcca ggccaccacg tgaccctcgc 660 

cgcccagatc ctggctgtac tgczgccccc cgtcatgctg ctcattgatg gcaatgttgc 720 

tcactggcac aacacgcggc gcgtcgagtt: ccggaaccag atgaagcccc ccggagagag 780 
cgcgggcacc tccggaaccg ccgccaccct: ggccaccgac ggcrgagttt tatggcaaga 840 
ggctgagatg ggcacaggga gccaccgagg gccaccctgc cctcctcctt gctggcccag 900 
ctgctgttta tccacgccct tuggcccgrt tgttcgacct tttgctcttt caaaactgtt 960 
Cttcgcagtt: aagaggcagc ccatctgccc aaactcctgg gcncagcgcc cgggagggca 1020 
ggagccctgg cactaatgct gcacaggtcc cctrcctgtc aggagagctg aggccagccg 1080 
cccactgagc ctcctgcccc ngagaaggga gtacggcagg gccgggatgc ggctactgag 1140 
agtgggagag tgggagacag aggaaggaag acggagactg gaagtgagca aatgtgaaaa 1200 
attccccrtc gaacccggca gatgcagcta ggctccgcag cgctgtttgg agaccgcgag 1260 
agggagtgtg tgcgttgaca catgcggacc aggcccagga agggcacagg ggccgagcac 1320 
tacagaagtc acatgggctc ccagggcacg ccaggggcag aaacagtacc ggctccctgt 1380 
cactcaccct gagagtagag. cagacccrgc cctgctctgg gctgtgaagg ggtggagcag 1440 
gcagtggcca gctttgcccc tcccgccgcc cccgtttcta gccccacggt tggcctggtg 1500 
ggggtggagc tccctcccaa acaccagacc acacagtcct ccaaaaataa acactctaca 1560 
tagamaaaaa aaaaaaaaaa aagggcggcc gctctagagg acccctcgag gggcccaagc 1620 
ttacgcgcgc atgcgacgtc atagccctint cccuacagaa gcngnaaagg gtcc 1674 



<210> 19 
<211> 2018 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (2010) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 

<222> (2012) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (2014) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (2016) 

<223> n equals a,t,g, or c 



<400> 19 



cagcaacgaa atcctgctrc cttttcccca gaaccacca" arccagtggc taaacggctc 60 

cccgattcat ggtttgtgga atczzgczzc ccttcttccc aacctccgyt tatctgcatt 120 

gacgccctcc gccctctcct cccnggaacc agaaggczzz gcrggcctga aaaagggaac 180 

ccgagcccgc atttcagaga ccccggzcac gcctcc'czn ccrgcgttac tcactctcgg 240 

gacagtgtgg gtagcttcag cacncaccga caacgargcc gcaagcatgg aarctttaca 300 

tgatctccgg gagctccatc taccccaccc aiarccccgt acaccatcga tgggatgttt 360 

gtcacttctc crgtgcacac cagttggccr crctcgcacg tccacagcga tgggtcacct 420 

gctagcgaag ccaacaacr.c ctgaagaccr ggacgaacaa acccacatca craccccaga 480 

ggaagaagca ctccagagac gaccaaacgg gczgtczzca ccggcggaat acaacataat 540 

ggagccggaa caagaactitg aaaarg-aaa gacrcrcaag dcaaaactag agaggcgaaa 600 

aaaggcctca gcatgggaaa gaaattrggc gtatcccgcc gtcacggctc tcccrcccat 660 

tgagacaccc atctcggtcc tctcggcggc tcgtaacatt ctctgcccac tggt cgacga 720 

aacagcaatg ccaaaaggaa caagggggcc iggaacagga aatgcctccc cccccacgtc 780 

tggtttcgtg ggagctgcgc tcgaaatcac twCgacttcc caccctacgg cgrcccctgt 840 

tgtcggcttc catagcctcc gattntttgg aaacrtcacc cccaagaaag atigacacaac 900 

tatgacaaag accattggaa actgcgtgcc caLcctggtt ctgagctctg ctccgcccgc 960 
gatgtcgaga acaictgggaa ccaCwagatc cgacctactr ggcgacctcg gaaggtittaa 1020 

tcggctggga aatttccata ttgtatcacc ctacaatttg ctctttgcca ttgcgacaac 1080 

actgtgtctg gcccgaaaat tcacctctgc agcccgagaa gaacctttca aggccctagg 1140 

gcctcataaa cct:cacctac caaacacccc aagggatcca gaaacagcca agcctcctgt 1200 

aaatgggcat cagaaagcac cgtgagacgc acagacggcg tcctccgcca ccaagagacc 1260 

cgagaactcc agattcacga catccccgtc ccacgcagaa gcatt.tccat tcaaccgtgg 1320 

cccctcttca gaacctagac ctaccagcgc catccccctc tcataatcca cgaagaactt 1380 

ggctatggcc gatcttttcc aaac-caact ctctgacgga ccccgcagtt cccagttaag 1440 

tgcagattcc ctacagacat atagaacagc gcattcttct gtagacattt gctcatgtcg 1500 

gcaaacacaa tcacccacat gaaaaaactg tcttcacccg acatgaaaat gttagaaaag 1560 

gcaaaccccg ggacttccaa agatttacct aaaccccatt atgtacctta ttcagaacgt 1620 

agaagctgac trgaaaggca tccttggtac caagtgaagc ttattcagaa aatgcattct 1680 

t'caaatgcaa tggcaactgc ttgtagacat catttttgca gtgtargttg gagctgtaat 1740 

ggttgcaatt atgtttctca tctccccaaa agcaaaaagc gcagtttctg acctatgtta 1800 

tagaatgata ctgatcagac tttgagccaa ggggaaaata ctaaattcct ttaaacctgg 1860 

agccttagag agccacagga atatctcctg ctgtacagtc taataagctg cggtaggaag 1920 

tatcatgtaa ccacagttca atgacagttc acgtacatac ataactcagt acticcctcga 1980 

g'ggggggccc ggcacccaat tcgcccaagn tnancnag 2018 



<210> 20 
<211> 2098 
<212> DNA 

<213> Homo sapiens 



<400> 20 

ggcacgaggg accgagctat tctcciggga ccggccatga tggrgcgccc ca::catgatg 60 

tattttctgc tgggaatcac actcctgcgc tcacacacgc agagcgtgcg gaccgaagag 120 

tcccaatgca ccttgctgaa tgcgtccatc acggaaacat ttaatcgccc cctcagctgt 180 

ggtccagacc gctggaaact ttctcagcac ccctgcctcc aggcgracgt taacctgact 240 

tcctccgggg aaaagctcct cctctaccac acagaagaga caataaaaac caatcagaag 300 

tgctcctata tacctaaacg tggaaaaaac tttgaagaat ccatgzccct ggcgaatgct 360 

gtcacggaaa acttcaggaa gtatcaacac ccctcccgcc atcctgaccc agaaggaaac 420 

cagaagagcg tcatcctaac aaaaczctac agttccaacg cgcrgtccca trcactcttc 480 

tggccaacct gtatgacggc ngggggrgtg gcaattgccg ccatggcgaa acttacacag 540 

tacctctccc cactatgtga gaggacccaa cggaccaaca cataaarigca aaaatggata 600 

aaataatcct tgttaaagcc caaacactigc rrtcrtcca- tictrcaccaa agaaccrtaa 660 

gtttgcaacg tgcagtccgt tacgagrccc cLaacacacc cr::acacgra gagcaataac 720 

gcaaaagccg ccctacacgc aaacaccacg cdttacrat: tcaggagaac aaacaaccgt 780 

tttgtgtcgg tcggtggttc tcataacc'C azitccgtac cggaac-agt acciccctcc 840 

ctcattccgc caaaacaggg c::cagc^a-t: cacctgccaa gcctcgtgga ggaacgtagg 900 

tgacatcaac gtgataaagt ctgtgttctg agctgtcaga ccccr."gaag acaacattct 960 

tcatcaccta tcgcctacta aagccacagc caaaaacacr tttcttrcct attctaaact 1020 

gagccctata gcaagtgaag ggaccagart rcctaaccaa aggaagccag gtacctttct 1080 

tgtatctttt accatatcac tgcaaagaag aggggaaacc cagccagcta ctttttttca 1140 

tcacttttta ttcataactt cagacttgca aaactaatcc ccaaaacata agctgctttc 1200 

atcagccagt Cctataatat cztcctgzga t^tacgtaga aaacgaacac accccttttc 1260 

cattcaagac cctgctactg tgcgaagaga tgatacttac aaggagtgtc actacctgtg 1320 

agctgactga atgtcggtag gtgctccaci acaacccagg aaagcctgtg Cwactgatac 1380 

ttgtgcggaa acccttactt cact::caawt raaccaccag acggcaaaat taagatgc::a 1440 

ctcgttggca aaaatcggt-g gactggcccc aacgggtaaa tgcgccgtgg caaactaatg 1500 

tgttggaata ttgctctccg cgaacttigcg cczaagtcaa cgaacgcgta gcatctcctt 1560 

ctgacaagca ctccctattg ggattctaaa gctacgtgca cagaacatta gtctcttcta 1620 

cacgttttac ccttccacct acaaccccct ctcctgctgc tacact:" tac acacagaaca 1680 

gatccctttc ctaacacaca ctcgaaccga acaacagact caaagaaagc cctcgttcac 1740 

atcgctatct acttctgngt ccgggggaaa atacgaggga ctgactttaa ataaaaaaca 1800 

tcccatctct catttaacac caataccaaa agaagaagac aaacatccat ctttctcatc 1860 

tatatttaag taccttt-cg caa-gcag^a ccaaagcttt ttaggcaacg caaaacctra 1920 

caaatcactt gtggaatgaa tggtaaaacc aatctgatga aacggaaaat tatrccgcaa 1980 
tatcgcaatt catagtticga cctttcacaa gcaaacaaac ccctaggatg laatcaggac 2040 
ttcaaacgtg caactaaatt ttcctaaaaa aaacctaaaa aaaaaaaaaa aaaaaaaa 2098 



<210> 21 
<211> 1746 
<212> DNA 

<213> Homo sapiens 



<400> 21 

gcgttcgcgc acctccagct cgggccgacg tggaagcttt ggagagctga agagggcgcg 60 

gcggcgctcg gcggcgcgct cctcctgccg ctctccgcgc caggggcccg ccagctgctg 120 

aagcagaggc ggccgacggg cctccccccg gggccgccgg ggctgccatt caccggcaac 180 

atctactccc tggcagcctc atccgagctt ccccatgtct acacgagaaa gcagagccag 240 

gtgtacggag aggtacagcc ccgacgggcc ccgggcaggg agggccgcca ggccggcccg 300 

ggccggccag ggcctccctg gctggactca cggccgcccc tgggccgact agccgggacc 360 

tctccgtgcg ccggctgccc tctgagggac acccgcctcc cgggtctgga agggagaagt 420 

ccccgacgcc gcgccccctt gcagggggag ccccgcccct gccggtgacc cactccgggc 480 

cgaggctccg aggcgaccca gtcctgatcc tcccgccacc gctcgagctc ccgctcctgc 540 

gcctgcgccg cccggcccgc cagccgcgcc gccaccccag gcccagggcg gacgcatgcc 600 

ctcaggtgcg ggcgtcttgc gagccggcci: cgcagctictg tggaagctgc acgcggcttg 660 

tcggaaaacc aaggcgttcc gagztccaga tggtitaatag caggccctcc ggtgcctgca 720 

grcgacgaac gactggcgta ggcgctcgcr gtgagaatgg agaatgcagg ggaacgcccc 780 

tgactgagaa gcgggccccg ggaaacgatt gtgaacgcgt gaacgaaccg acgactaaaa 840 

cccgctgcgg gggtcctaca gcgcagatgg taatgccgtt ccgaccggct: gggaacggca 900 

ccttagcaga tacctaaaag gcgccttctg tgcgccactg tcaccgccaa cttggtgact 960 

catttaaaac tcacaaccag ccggtgaggt cggtacttcg cccctcccca tcctgcggag 1020 

gggaaagcag cacggaaatg ccctgtgact ggcagcggaa aaggcgacca ccgcttgtgt 1080 

gtgggtgccc cgacgtccgg agggggcagg agtctccacg ggtcctggga cagagctcac 1140 

ctgttittgct ttgaatcaca cttatttaca cgcaactaca ggcctgacgc cagcggtgaa 1200 

gaaggcagat acagcctttt aaggagttgg cagatgagtg ggagagagaa aactaatctc 1260 

attatcggcc acaggctgtg gccagcgttt cgaaggaaaa gtacagggat cntcggcaac 1320 

tgtggtattt caggcttgac cttaaacccc cactcaaacc agcttttaca aggattggtc 1380 

taggtgcccg ggcgcggtgc tcacgcctac aaccccagca ctccgggagg ccgaggcggg 1440 

cggaccacga aatcaggaga tcgagaccgt cgtggccaac acggcgaaac cccatctcta 1500 

ctaaaagaat acaaaaaatt ggccgggcgt ggcggcgggc acctgcgccc ccagctattc 1560 

g^gaggctgg ggcaggagag cggcgcgaac ccgggaggcg gagcc::tcag zgagccgaga 1620 

tcgcgccact gcactccagc ccgggcaaca gagccagacc ccgtcrcaaa aaaaaaaaaa 1680 

aaaaagggcg gccgcrctag aggacccaag ccracgcacg cguocacgcg acgccaatag 1740 

ctcttc 1746 



<210> 22 
<211> 2876 
<212> DNA 

<213> Homo sapiens 



<400> 22 

ccacgcgccc ggcggcgagg cggcggtggt: ggctgagccc gtggtggcag aggcgaaggc 60 

gacagctcta ggggttggca ccggccccga gaggaggatg cgggcccgga tiagggccgac 120 

gctgctgctg cgngcggcgc cgccgagctt ggcctcggcg cccccggacg aagaaggcag 180 

ccaggacgaa tcctitagact ccaagactac tctgacacca gatgagccag taaaggacca 240 

tactaccgca ggcagagcag tcgctggtca aacactccti: gatccagaag aacccgaat:: 300 

agaatcctct atccaagaag aggaagacag ccccaagagc caagaggggg aaagngtcac 360 

agaagatacc agctctccag agcctccaaa tccagaaaac aaggactacg aagagccaaa 420 

gaaagtacgg aaaccaggca gcccggacac cttccttgct tttcgattca cctaggggac 480 

aactgaaaar rtcaagctaa tgaacaaaga ggccgaagaa gaccggcttc accgattatt 540 
acccacaaat aatatatgga gtgcagttcg gagagaattt ctagatttca acataccaaa 600

gtcattcacc caacagaccc gacccaagtt acataagccg aatcccacag acgagatcgc 660 
atttatctct ccattaacac cccgacgrcc ccctctggaa ccaagtatgt aaacaccctt: 720

ctcactgaga aacttaaaaa aataatacgc cagggcagca cacaaaacag aaacccagga 780 

agaactgccc aacatgccac tatggttac- ttcaggatcg cggtaggacc actaccacaa 840 

taaccttgtc gatcatccta ggtaacccga tittaggtgtt tacccctcag ccgacgcccg 900 

tgtgtacctc catcctactc caaccrccac aaccctgaaa cctctacttc tcaattcaaa 960 

acttgacctc tttcacaata gtcttccaaa tatcaaccct cctaaaaact atcaatgtat 1020 

gcattctacc atgtcttgtc gccctitctgt caictattat ttcaatgacc actgccctac 1080 

ctgtttatta ctccaccact aaccactccc aggaaccgat ccaaattgga actctttaaa 1140 

aaaatagaat ttcctttcac cacaaataca ccactctcat gaacagaaaa ccactcttaa 1200 

aaaattaaaa gtacatgaat cccaccaact agaaataaca gccgctcaca ccacggagaa 1260 

tattctctca ggattctttc aggtgtacca gcagctttiaa gaaaagtaaa tcgggtcaaa 1320 

atgtgtattc tatgctgcag cccgcaccct gcacctaaca gtttaccacg aatgtttttc 1380 

catgtcactt agcttatttt caacttgaca gccaatggct acaaaaaatt ttatttgtaa 1440 

atatatggta ccataaacca aaacgtccat gctttgctgg gagatcattt tagatgcatc 1500 

tttgtgccta tctgtgaaaa tttcctcaga actctcaaag taaatttgcc gaactggatg 1560 

taaaagttta aaagacagta gacacacatc gt:::aaatcgc ccttcagaaa agtgattccg 1620 

tttacctcct ctgaaaatac ctcttccccr gcacctcacc caagcgtgta ttaccatctt 1680 

aaaattttta ctaaatacaa tccaactcac zczzcaczzz tcaattaggg actttttctt 1740 

tcttttttta gttttcatat ttttagagac acggtcrcac tctgtcaccc aiggctggagc 1800 

gcagcggcac gatcacagtt cacrgcagcc ccgaacccag gtgacctccc tctctggccc 1860 

cccaaagtgc tgggactaca ggcacgagac actacaccaa gcctgggaaa tcttcaaaca 1920 

cgcataaaag tagagggtaa aacgacaccc zzzaccaacc czcaczcagc ttcaacaaac 1980 

accaacactc cgccggtcCC tzcatczzza czaccczaca cacacctgcc ctt t t t ct 2040 

ttcctagaat attttaaggc aaaccccgca rgcgctcttia cacctccgrg cctctctcta 2100 

gtaagaatgt tctttccagc aaaaccacag aactagccac tactccttca cnacctaaca 2160 

cccagaccgt tcaattttcc cagttgcttc aagcgcccgt ttacaattca cttgtctgaa 2220 

ttcatttcca agctactcag gaggctgagc caggagaatt gcccgaaccc aagaggcaga 2280 

ggtcgcagcg agctgagatc gtgccactgt actccagccr gggcgacaaa gtgagacccc 2340 

atgtcaaaac aaaacaaaac aaaacaaaac aaaacaaaac aaacaacaca attgaataat 2400 

atccctcagt tattatttag agaagagacg actcaatata aatgcaacac gtgatcatct 2460 

ggtcacagaa tgaatttcag tcattgctca ggcaagcccg gcaaaaaaaa aaaaaaacgg 2520

gactccatga ttagtgtgtt gaaaaggggg tcgccaaagc gcaacagaca gagccgggtg 2580

cagcggctca cgcccgtaat cccagaaCwC tgggaggcca aggcaggcgg atcacgaggt 2640 

caagagactg agacaatccc ggccaacatg gigaagccct gtawCaacca aaaacacaaa 2700 

aacaattagc cgggcgcggt gacatgtgcc tgcagcccca gctactcagg aggctgaggc 2760 

aggagaattg cttgaacctg ggaggcggag gcigcagcga gctgagatcg ctccactgca 2820 

ctccagcctg gccacagagc gagactccat cccaaaaaaa aaaaaaaaaa aaaaaa 2876



<210> 23 

<211> 1052 

<212> DNA 

<213> Homo sapiens 



<400> 23

tcgacccacg cgtccgccca cgcgtccgcc cacgcgcccg ggcgggcgca ggacgtgcac 60

cacggctcgg ggctcgccgc gccggttgcc gcggcrcctc g-gccggggc tccggccggc 120

gttgctgcgc tccctggccg gggagcaagc gccaggcacc gccccctgcZ cccgcggcag 180

ctcctggagc gcggacctgg acaagcgcar ggaccgcagc accccctgcc cccttccggc 240

tgctttggcc catccttggg ggcgccctga gcczgacczz cgtgctgggg ccgcttcctg 300

gctccttggt ctggagacga tgccgcagag agagaagtcc accaccccca tagaggagac 360

cggcggagag ggctigcccag ccgcggcgcr gacccagcga caatgtgccc cccgccagcc 420

ggggctcgcc cactcatcac tcatccaccc accciagagc cagLctctgc ctcccagacg 480

cggcgggagc caagcccctc caaccacaag gggggcgcgg ggcggcgaac cacctctgag 540

gcccgggccc agggcccagg ggaaccctcc aaggtgcctg gtcgccctgc czczggczcc 600

agaacagaaa gggagcctca cgctggccca cacaaaacag ccgacactga craaggaact 660

gcagcatttg cacaggggag ggggcgccct ccttcccaga ggccccgggg gccaggctga 720

cttggggggc agacttgaca ctaggcccca cicacccaga cgccctgaaa ttccaccacg 780

ggggtcaccc tggggggtta gggacctatt ttiaacacca gggggctggc ccactaggag 840
ggctggcccc aagacacaga cccccccaac cccccaaagc ggggaggaga cactcactct 900

ggggagagtt cggaggggag ggcggggggg gggaaaaaaa aataaaaaaa aaaactttta 960 

atcctcaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1020 

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aa 1052 



<210> 24 

<211> 1541 

<212> DNA

<213> Homo sapiens 



<400> 24 

ggcacgagct tcctcgatcc tactctgtct crccttgttt tggagaagtt caatttacca 60 

gctggatatg tgggaccagt attcccgggc atggcactgt cctatgccat ctcttcacca 120 

ctatctggtc ccctaagtga caaaaggcca cctctaagga aacggcttct ggtgtctggc 180 

aacttaatca cagccgggtg ctacacgctc icagggcctg tcccaatcct gcacactaaa 240 

agtcagctct ggctgctggt gctgatacta gttgtaagcg gcctctctgc tggaatgagt 300 

ataattccaa ctttcccgga aattcccagt: rgrgcacacg aaaacgggtt tgaagaggga 360 

ttaagcacac tgggactcgt atcaggrcrz tttagrgcaa cgcggtcaat tggcgctctt 420 

atgggaccaa cgctgggcgg acttctgtar gagaaaacrg gcr-t:gaatg ggcagcagct 480 

atacaaggcc tatgggctcr gacaagcgga ctagccangg gct-gcttca cctaccggag 540

tattcaagga gaaaaaggtc caaacctcaa aacazcczca gcacagagga ggaacgaact 600 

actctcctgc ctaatgaaac ccagtccgac cgatcccgga crgacacaag gtcgagaaac 660 

gaatgcccct ggcctcaaac aticaccgcag gaagggcrcr caaaacctta cgcgcaaaac 720 

tccgtggacc .ccgtgccagt gtctcggaag cgtcaacgcg ctcttggacg accctgtatt 780 

gggctgtact tactgtgaca ctgaaaagct grcctgccga agcagccaca cccgaaacat 840 

taagtatgaa aggagtaatt aaaaacaagc aaaacaaaac aagacc^agt tcctaaatga 900

ccaaactcgt- ccctaaagat gctgttatra acccgagtra gctcrcantt cccctgttca 960 

ttttttattc taagtacact gattcrgtga acgtaccctt tciartacac agggaaagaa 1020 

atgaactaat ttgatatgct ctaaaracat aaaggtgczc caaaatatgt agaaacatta 1080 

ctatgaaatc agttttcaaa agacatactt cctct:ccgt:c ctgaggttrt ccggccctgt 1140 

tcaaaaggaa gaattcctgc ccgccataca gaaacucccr agcactcccc gacctcaagc 1200 

tttcctaaaa artctgtctg tgtgaaaagt acaagaacaa caatacccac aacttccacc 1260 

ttcgtaaccc acgttcacct acgacccgga cccataaaca ccacctggca taacgtt^rt 1320

catttccttt: aatgtctctg tcctctggcc czaccatczg tctcgttittt gtctttatct 1380

atatcttggt agatgtacct caccccraga gcaggtcagc czcczzcccc taatgcgaac 1440 

gdttgttttg ttagggaagg gcttccccca acCwCgcgrg aaaccgtgac gttgaagcga 1500 

acaaatgcct attgtgtaac aaaaaaaaaa aaaaaaaaaa a 1541 



<210> 25 
<211> 2079 
<212> DNA 

<213> Homo sapiens



<400> 25 

ggcacgaggg aggcgcggct gcggcacctg accagaccct acgacaaggt accttccccg 60 

catgaggact caacaacccc cgtiggccaac cczctgctzg cazzzaczct caccaaacgc 120 

ccgcagcctg actggaggaa tgcggiacac agcctggagg ccagtgagaa catccgagct 180 

ctgaaggatg gctatgagaa ggcggagcaa gaccccccag ccctcgagga ccccgaggga 240 

gcagcaaggg ccctgacgcg gctgcaggac gzgcacatcc tcaacgcgaa aggcccggcc 300 

cgaggcgccc ttcagagagc cacrggctcc gccatcacig acctgtacag ccccaaacgg 360 

cccttttctc tcacagggga tgactgc'tc caagtcggca aggiggccta tgacatgggg 420 

gatcattacc acgccattcc atggccggag gaggccgtca gcctcctccg aggacctcac 480 

ggagagngga agacagagga tgaggcaag:: ctagaagatg ccccggatca ctiuggccttt 540 

gcctatcccc gggcaggaaa tgtictcgcgc gcccccagcc cccctcggga gccccctctc 600 

tacagcccag ataataagag gatggccagg aatgticctga aacatgaaag gcccccggca 660 

gagagcccca accacgtggc agccgaggcc gtcarccaga ggcccaatat accccacctg 720 

cagaccagag acacctacga ggggccacgc cagaccctgg gttcccagcc caccctccac 780 
cagaccccca gcccccactg ctcccatgag accaattcca acgcccacct gctgcticcag 840

cccatccgga aggaggccac ccacccggag ccctacaccg ctccccacca cgacctcgcc 900 

agtgactcag aggctcagaa aattagagaa cttgcagaac catggccaca gaggtcagcg 960

gcggcatcag gggagaagca gtcacaagcg gagtaccgca tcagcaaaag tgcctggctg 1020 

aaggacactg ttgacctaaa acrggcgacc ctcaaccacc gcattgctgc cctcacaggc 1080 

cctgatgtcc ggcctcccta tgcagagcan ctgcaggtgg tgaactatgg catcggagga 1140 

cactacgagc ctcacctcga ccatgctacg tcaccaagca gcccccccta cagaatgaag 1200 

tcaggaaacc gagtcgcaac acccacgacc catccgagcL cggcggaagc cggaggagcc 1260 

acagccttca tctatgccaa cctcagcgcg cctgtggtta ggaatgcagc accgcnttgg 1320 

tggaacctgc acaggagtgg tgaaggggac agtgacacac ttcatgctgg ctgtcctgcc 1380 

ctggtgggag ataagtgggt ggccaacaag tggacacacg agcatggaca ggaatcccgc 1440 

agaccctgca gctccagccc tgaagaccga actgttggca gagagaagcc ggtggagtcc 1500 

tgtggccttc cagagaagcc aggagccaaa agctggggta ggagaggaga aagcagagca 1560 

gcctcctgga agaaggcccc gncagcctcg tctgtgcccc gcaaatcaga ggcaagggag 1620 

aggttgttac caggggacac tgagaatgca cacttgacci: gccccagcca cggaagtcag 1680 

agtaggatgc acagtacaaa ggagggggga gtggaggcct gagagggaag ttcccggagt 1740 

tcagataccc tccgttggga acaggacatc rcaacagtct caggttcgat cagcgggtct 1800 

tttggcactt tgaaccccga ccacagggac caagaagcgg caatgaggac acctgcagga 1860 

ggggctagcc tgactcccag aacttcaaga ctttictcccc accgcctcct gctgcagccc 1920 

aagcagggag cgcccccccc ccagaagcac accccagacg agcggtacat tatataagga 1980 

tttttttcaa gntgaaaaca actctctttc cttcttgcat gacggttttt taacacagtc 2040 

attaaaaacg tctacaaacc aaaaaaaaaa aaaaaaaaa 2079 



<210> 26 
<211> 1947 
<212> DNA 

<213> Homo sapiens 



<400> 26 

tgtaaacaga ttggagaacc tagcaataag attcaaagct aatctggagc ataaaggcac 60 

agttcagaga cagaataaca gggatcacaa gcatgaacta aaaggaattt attcgcttca 120 

agttcctaga tacaaccctc ccatgccgca cttctccact gtcggagcac gttccgaaaa 180 

acagaatgcc ctgatccctg gcgggtgcga aggcagtngt cagggacggc aggcattggt 240 

gggccccaaa agacgaaggc cccacacaca ggtgtgctgc attcgggarc tgcgtgggtg 300 

tttcttggac cctttcttct gggagtaggg cacacaccaa cgcccaatcc gctgtccggg 360 

tgcacgtcca cagcacggtg gccaaacticg aacaccactg caaataggac gctgagcagg 420 

tccgtctgtc atgtcacgcc actgcacagg tcczzgzccc cacacgacgg ggagtactcg 480 

cgtcagacgc cattgaacag cccgtcicgg gcaggggaag cggggagccg gggatatitaa 540 

ttgggggtct riaactccarc a-cacgccag crgacattiac gaccar.ai:aa cgcagccaga 600 

gacaactttt accttgcrta tagraaaggt ccagcctgcc aac-gtaaac catcccaatc 660 

tggcaggctr actitttgaca rcggaaaggg cagaaagcga zzzgccccag tagtgcaaca 720 

ggagttatag accagaggcr gaaacccaaa ctatataaaa aggaactcag tggagggggc 780 

tttgraatct ccattaactt gtgccgctac ctccaggacc accaaaaatr acacgtaact 840 

ttacatgcta aacacattga aacacaacct atgtttatiaa agcataacgg gcttcccctc 900 

cagaagcccc cctgctcgtc acgaagtgag aacaatgaaa agccacagca gacactcagt 960 

tcaactctgt gcagaaccca gtagtgcriwg agcrgccac" cagatttgaa ttcagactgt: 1020 

gtgtcgtttg cttacggaca ctgcctgtcg ctctgccact gt::aaattaa tgagtctata 1080 

aggtttttct tccagaggcc ataggtgaca tcactaaaat tgcaagacaa accgcaacct 1140 

ttgccgctgc cgcactcccc aacctctccc ccaccccccg tggcgcgccg cctcccagat 1200 

gagcgcgtt:: tggagcaggc ccacctggga caccctacgc tcrcaccaag gaagtgcgat 1260 

ctgagcagcc acaatccagc caaaagagga ccgcagatat ttgc^ctgan caactagatg 1320 

aaaatacagc agaatggacc tagcccactg ccccgcttca tccaaccgag zczczgacca 1380

gcaattggtg cataaccacc acagcaaaag tcaagaaacg aaaccgtagc aactatgcaa 1440 

atgaacgcg^: cggcctccca atacctgtca ccagtggacc tcctgtgagg aagttagtct 1500 

tttgtcttga cgaaatgccc tcgctcttta aatcttaact ctgctgccca catcctccca 1560 

aagcgcgccc acttcatctg ctcaatctaa atgaactttc ctcctcgtat gtatgaggtg 1620 

acztggtggg cggggcgggt ggtccctgtc tttgtgttcc ttctttctta gggcatctgt: 1680 

aggccccaaa ggacc::tt:cc ttcaggccac atccctcaga aagccttcaa tcttcccrtg 1740 
ctntcgtttg t::tgct:t::tc craaagaana tcttcaaagc ccaaacttgt acatcaattt 1800

aggactatut agaagcatag gccgtcgttg gcggcagcag tatatcctga aacgtctcac 1860

agatacatac cttcgaataa agatggcgct gttgaacaaa aaaaaaaaaa aaaaaaaaac 1920

ccgagggggg gcccggtacc caattcg 1947



<210> 27 
<211> 3379 
<212> DNA 

<213> Homo sapiens 
<400> 27 

atgatattca agtcgaattc ccctcactaa agggaaaaaa gccggagctc caccgcggcg 60 

gcg.gccgctc caaacctact ggaccccccg ggatgcagga atccggcacg aggtctcggc 120 

tccccccggc acccgcccgg cccggctgct cccggctcct cccggccacg gggagccgcg 180 

cgcggccgct gctgctctgg ggctgcacgg tggtggccgc aggactgagt ggagtagccg 240 

gagtgagttc ccgctgtgaa aaagcccgca accctcggac gggaaacctg gccccggggc 300 

gaaaactccg ggcagacacc acctgcggcc agaatgccac cgaactgtac tigcttctaca 360 

gtgagaacac ggacctgacc tgccggcagc ccaaacgcga caagcgcaac gctgcccacc 420 

ctcacctggc ccacccgcca tccgccatgg cagacccatc cctccggttt: cctcgcacat 480 

ggcggcagtc cgcggaggat gcgcacagag aaaagaccca gctagacctg gaagccgaat 540 

tctacttcac tcacctaact gcgargccca agtcccccag gccggctgcc acggcgctgg 600 

accgctccca ggactctggg aaaacatgga agcctcataa gtacttcgcg accaactgct 660 

ccgctacact cggcctggaa gaLgatgtcg ccaagaaagg cgctacctgt acctccaaat 720 

actccagtcc ctctccacgc actggaagga aggttactct: caaagctttg ccaccaccat 780 

acgatacaga gaacccttac agtgccaaag ctcaggagca gccgaagacc accaacctcc 840 

cgcgtgcagc cgctgaaacg acagtcttgc ccctgtcaga gaaatgacCw gaacgaagag 900 

cctcaacatt ttacacacta tgccacctac gatttcactg tcaagggcag cngctcctgc 960 

aatggccacg ctgatcaatg catacctgct catggcrcca gacctgccaa ggccccagga 1020 

acactccaca tggtccatgg gaagtgcacg cgttagcaca acacagcagg cagccactgc 1080 

cagcactgtg ccccgttata caatgaccgg ccatgggagg cagctgatgg caaaacgggg 1140 

gctcccaacg agtgcagaac ctgcaagtge aatgggcatg ccgacacctg tcacttcgac 1200 

gttaacgcgt gggaggcatc agggaatcgt agtggtggcg nctgtgatga ccgncagcac 1260

aacacagaag gacagtattg ccagaggcgc aagccaggct tctatcgcga cctgcggaga 1320 

ccctcctcag ctccagacgc t::gcaaaccg cgcccccgcc acccagcagg accagctgtc 1380

cttcctgcca actcagtgac cttctgcgac cccagcaatg gtgactgccc tcgcaagcct 1440 

ggggtggcag ggcgacgttg tgacaggcgc acggtggga:: accggggctt cggagaccat 1500 

ggctgtcgac cacgcgactg tgcagggagc tgegacccta tcaccggaga ccgcatcagc 1560 

agccacacag acatagaccg gtatcatgaa gctcccgacc tccgtcccgr gcacaacaag 1620 

agcgaaccag cccgggagcg ggaggatgcg caggggrttic ctgcacttct acacncaggt 1680 

aaatgcgaac gcaaggaaca gacactagga aangccaagg cattctgtgg aacgaaatac 1740 

tcatacgcgc Caaaaataaa gaittcatca gcccangaca aaggcaccca cgtcgaggtc 1800 

aatgtgaaga tcaaaaaggt cttaaaaccc accaaaccga agatnttccg agggaaagcg 1860 

aacattatac ccagaaccat ggacggacag aggatgcacc cgcccaatcc ccaatcccgg 1920 

tctggaatac cttgtagcag gacacgagga tatiaagaaca ggcaaaccaa rtgtgaatat 1980 

gaaaagcctt gtccagcacc ggaaaccctc cctcggaaga aaagtcaugg atattttaaa 2040 

aagagagtgc aagcagcart aagatggaca gcacacaatg gcacncggct: atgtccaaaa 2100 

cacaaacttc agagcaagaa gacctcagac aggaaaccgg aactttttaa agtgccaaaa 2160 

cacatagaaa cgttcgaacg cacgggcccc tacccaactt atctctcccg gacccacgct 2220 

taaatacagt tccattccat gaagagaaat gaaaaccccc acacngacac czgztztcza 2280 

tgggaccgac tctgaaattc ttaactatca agaatatttc aatagcagca cgacacttag 2340 

cagcaatcca tcaagggcag cacctccaac aaggacgcct tccagcctca accatgtcac 2400 

ctacgcctga tgccactcaa agtaatgaat gacgtttcaa ggaaccccca accccactat 2460 

cagaaaaggt gttcgtcaaa gagcctcctic ttgtgtgcca cgcacgaacc ccggtctgta 2520 

ggtgttaaac ggaccctctc cacgtgcaca tagtactccc ctgtataaag cactctacca 2580 

cctaccacct gtgttgtgaa cgtctggcga ccgccgccga aagaaggaaa agggtgtgcg 2640 

agaaagccta ctgaagcagc agcaccgcca ccacatgcgg acaaaagtga acatataaaa 2700 

gaagtcgtgc cactcaactc cgaacacttg gagaaac.tag gtgaagatgc aaccagaaag 2760

gagaatatgt atgcgtgaag tctcagcttc gagctggagg ctagattcca agatgacagc 2820

catgatgaaa cttcttaaaa aactaaacca gaagagactt caaaacaaga gaaagaaatc 2880

ataaatgcag acacatgcct ggccaaaggg gaaacggact ctaaactcta aagagcccat 2940 

ttgcaatgca cctgcataca ccccaaaaac tactgtagac acagaatttg tcacactcct 3000 

gtgcttagta cttaaacctg aacattgaaa cagctctcct ccttgccttt cttaacagta 3060 

atagtcatta tatttacctg cctcttaaca caacgtatgt gatagccaaa aaatcacagt 3120 

ttttcattat tattcatcct ctgtacccac gcataaccac tacacatagt trcctctgca 3180 

cctgaatata caaaacacga acacagcgcc acacgaacaa ttccacatac agaaccctct: 3240 

tttccctgaa gtcctgtgga cLtgcaaaca tatatataca ttgccctgct: aatttgctct 3300 

tacattccat atatgtaaca aaggaatatg atctgaaaaa aaaaaaaaaa aaaaactcga 3360

gggggggccc gtacccaaa 3379 



<210> 28

<211> 2006 

<212> DNA 

<213> Homo sapiens 



<400> 28 

cccacgcgtc cgcggacgcg tgggattttc tcttaagaaa aaaaaaggaa gacgttttaa 60 

taggaaaaga aaaacaagaa gcagtagcag ccaggtaact taaacgtcct cctttctttt 120 

tcctaagaga aaatggaaca tttaggttaa acgtcttcaa at'rttaccac ctaacaacac 180 

tacatgccca taaaatacat ccagccagca ctgt:at:ttca aaacccctcg aaargatgat 240 

atcagggtta aaatcacttg tattgtrtct gaagtttgct ct^rgr^flaact: actgntitgag 300 

cactgaaacg ttacaaacgc ctaataggca ttcgagactg agcaaggcta ctcgctatct 360 

catgaaatgc ctgttgccga gccatcctga acagaaatat ttraaagtat caaaagcaga 420 

tcttagttta agggagtttg gaaaaggaat catatctctc ctttccctga tcccgcactc 480 

aacaagtctt gatggaatca aaatactctg ctttattctg gcgagcctgc tagccaatac 540

aagtatcgga caggtaataa tttgccatcc ttaatatcag taaaacgaat caagacatta 600 

taggattaaa cataatttca tacggctagc actccattgg ccgacccaaa tttatagcgt 660 

gtggaaattg agaaaaatga agaaacagga cagatatatg atgaactaaa aatatatata 720 

ggtcaacttt ggtctgaaac ccctgaggtg tttttaacct gccacaccaa cttgtacact 780 

aatctatctc tttagtctag aaatagtaaa ctgttrgcaa gtcaccaaca atcattagat 840 

aaattatttc cttggccata gccgataatt ttgcaatcag caczaagtgr atacgcattt 900 

utgccacttt ttccccagat gattaaagta agtcaacagc c tiazt: itagg aaactggaaa 960 

ag-taacaggg aaagagattt cactatttgc tccatcagcg gtaggggggc ggtgactgca 1020 

actgtgttag cagaaactca cagagaatgg ggatctaagg tcagcagaga aacttggaaa 1080 

gctctgtgtt aggatcttgc cggcagaatt aaccttctgc aaaagtctta cacacagata 1140 

tttgtattaa atccggagcc acagccagaa gacccagacc ataactggct tacttutcca 1200 

tttccgtaac tartgcaatc tccaccttrg taacaactt: r gac-iraaaac acaaanctac 1260 

ttactcatct uuttaatagr caaaaaCcct tgcrgtcgca gcccgcaacc cccaaaacga 1320 

ctgcgttgct tttaggantg accagaagaa acactccaaa aairgagatg aaatgttggt 1380 

gcagccagtt ataagtaata tagttaacaa gcaaaaaaag cgccgccacc ttttargacg 1440 

atcttctaaa tggagaaaca tctggctgca tccacataga cctttatgct ccgttttcag 1500 

ttgaaaactt gccccctccg gcaacatccg caaacgaagc agaatccttit tt.cccctttt 1560 

ttccaaatac gttagttttg crcttgtaag acgtaccatg ggtactggtg ctgtgtaatg 1620 

aacaacgaat tttaatcagc acgtggttca gaacatacaa cgtcaggtcc ttaaaaagca 1680 

tcttgatggc ccctctccac ccacaacttc agacrtitcat aaagcgtacc aagaactcca 1740 

taaattcgtc ttcagtgaac tgctttctgc cacggtaggc caccaaacac agcacttact 1800 

cctaaaaatg aaaatttcrg atcacctagg acaccgacac atzccaactt gcagtgtctt 1860 

tttgactgga catactaacg ttcctccgaa tggcaccgat agacggtcca gaagagaaac 1920 

tcaacgaaat aaagagaaca cctattcaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1980 

aaagggcggc cgccccagag gatccc 2006 



<210> 29 

<211> 3070 

<212> DNA 

<213> Homo sapiens 
<400> 29

gttggagcct: tagacctccg aagcgaccca tagaaggtac gcccgcaggt accgg::ccgg 60 

aattcccggg tcgacccacg cgcccgcaac gtatgaggag tcngatttcz ccacaccctt 120 

gtcaacacct gttgttgtcg ccgctgtcgt tgtttttaat tttagctatc ccagcagacg 180 

tgaagtggta tctcgtgctt tttactcgca tttctccaat gaccagtgac gccgaacacc 240 

ttttcatgcg cttgttggcc atccgcacat ccccttggag aaacgcctac caagtccttt 300 

gcccaacctc aaatcggctt gtccccttgt cgagctataa ccatcgaatt tcaatrgccg 360 

taaaacaccc catgagtaca tgcagcacgc actcaataac caccagctaa tactatagtc 420 

gtaatcattt ttgccacatg tggcctgaac cggaaggaaa ccgcctagtr cttccaagta 480 

ggagcggaga tgtaacaaaa cctgcantta aggccaacaa agatttgact ccgtcagtgt 540 

tgatactttg cctttaattg acgcccccaa atcaagaaat tagaaacgtg tacacataag 600 

aatcagttta tctttggcag aaatcagtac atcatcaatg xttaattaac ccctaagaat 660 

taatggcaat agtgcctcta gtttcagccg agattcagac catctttctt catcctgccc 720 

tgctatgcga catttgctat aaatgccaca atgagaactt tttgaaacct gggtcagaga 780 

agcctacttt ttatacttaa taagcacgcg attgatgrat tatcactggc ctatgcccat 840 

tgcatcttigg ttttatgggg cgttcaaaag aagatagatg agaacaggtg atccaaagca 900 

atttagaatt ttatccccca gctgggactg aaccatcttg aactgatgaa tatattccac 960 

caggtttata ggtagatatg ccgttccact atatcggcga tttggagtga gacagtagct 1020 

tttaagcaga acaacagcca cacgtaagrc caaattcggc tattttcttt gaataagcaa 1080 

tggaaactgt atgcatccag gaaccggcga ttcagacatc ccccgaaggc cczczczccc 1140 

ctgtgccccc acagcttttg rcccgaaccc zcczgtctcc ctgcgacgcc. tctgtgccca 1200 

tacc::att:tt gtgaaccctg cccgcaacgc cgtcccaagc cagtrggtga crcctcaaac 1260 

ctcactgaag tctttcgggg cggatccaaa cgccgctttt agccactaaa atctgncaca 1320 

tctgctacat actttcacaa cagcacctac cctagtgtag ttctaaaaat gcacttaaaa 1380 

atttttaaag cagtgcacgc atgcagcaaa caaaatcgga cagtacagaa ataaacggtg 1440 

aagacagcct cccagttttc ttccccacgg atagcactcc ccccttttta ccctcccaaa 1500 

gatattttgt gctgatacaa gaattuttat accgtatgaa gugcctatcc atcctatctt 1560

ttacttcagg gcaatttacg agtaatgaaa aaattcaagc attacaaaaa cgcagagtgt 1620

gtgaagcaaa aggtcccctt cacnttgatc cctaacacta gccctttcca aacaacaccg 1680

tcctaattaa cattcctctc tccagaagta acctctgtca taagccttgt: ccgtgccttc 1740

agattgttct gcttacactt atacgcccac ttaaaagtgt atatttttgt acctugctitg 1800 

ttcgtttgt::: tgagacagag cctcgccctt gttgcccagg ctgaagtgca atggcactat 1860

ctctgctcac tacaacctgt gtctccccgg tccaagcgac tcccctggcc cagcctcctg 1920

agtagctagg attacaggcg cacgccacca tgcccggcta acttttgtat titctagcaga 1980

gatggagttt caccatgttg gccaggctgg tcacgaaccc ctgacctcag gctatccacc 2040 

caccLtggcc tttcaaactg ccgggattac aaggcargag ccaccgcgcc cggccaagaa 2100 

tttttctacc gacaccacag tgagcLccct gcctctccgg aacgacgccc aczzzgczta 2160 

cgatcaaccc aagcaggact czzczzzccc tggacgcctc zcccczggzc tggaatctcc 2220

cagtcctgcc agaactggcc zzzcccagaz gctgcaaact tccagt:::gaa cccczzzzzc 2280 

cgtgcggccc ctggggccgc gagaccaaaa cccacgagtc ccgtgcaccc tagacctttg 2340 

gaaggtgaga gcagggccct gagaaaaggc aaaacgccrc czczccczga cgacacccct 2400 

acccgcccca ctccccacca gaattgccag zggcczzzca ccacagcggc cczzcczgcc 2460 

tgagccccgc accgtcccag accacacaga agtccggcca cctcigggcg cccgggatgg 2520 

ccaccgaaga gaagcacgcc g-ccccgtct ccrcccggcc cccgccagaa aatcgaacaa 2580 

gcgcaattaa cacactgt::a ccgccgaagc ccgaaacccc caggacctgt cczzgazccz 2640 

tccagaaacc accaggtccg gcactcggag ccccccggag agggacctcc cagccgagcc 2700 

ctcaaagaac tccacgaaat caggaactgc tigatgaaat gzazczcczz gcacctggaa 2760 

gacgaagccc aaacacccac acczczgzcz cccccagggc tcgggacgtc cccagcagcc 2820 

cggccacgca gctccccagg cgggcccggg gaggcgggag cagggaccat czczgzcccc 2880 

cccaccccca ccccatccac c-cggagacc accctccccc agccagatac ggaacaaaac 2940 

tacagacgca gaaaaaaaaa aaaaaaaaaa gggcggccgc tccagaggai cccccgaggg 3000 

gccccagcct acgcgcgcac gcgacgtcac agccccctcc ccanagcgag ccgcattata 3060 

agggagcaaa 3070 



<210> 30

<211> 2227 

<212> DNA 

<213> Homo sapiens 
<220>

<221> SITE 
<222> (289) 

<223> n equals a, t, g, or c



<400> 30 

cggacgcgtg ggccgaccca cgcgtccggg aaaaarggaa aaratgccgt gcaaaatctc 60 

gtitctgtgtc cgaattgccg caggctcaga ccttcatttg aggttccgtg cccgaatcgc 120 

cgtaggctca gacctccatt cgaggctatg ttctataagt taacgctgat cttgtgtgag 180 

ctttcggtag ctggagtaac acaggcggcc tcacagcgac ccctccagcg ccttccaagg 240 

cacatctgca gccagcgtaa tccccccggg agatgcctcc tcaaggccnc gccccagacc 300 

acgtggggar ggcctgacar ccaattccca ggctgtcccc acccttgrag agtgacccta 360 

aacgctagac agacggggaa cgggaaagaa aagaaagctg cagacctcaa gttaaaattc 420 

ccccaaaaac gtttttacct atctgctttt tctgaaagga taaaggcttt ttgaaaatta 480 

ttttctaaca aataacatga acacttctag aaaccccaga aaaacacaaa gtactcaaaa 540 

tagaaagaaa aattacccat tactctcnaa gccagcatta tccactgcgg cgcttttgga 600 

gttgggtgag gccgtagcct ctgccaagtc aaggagcccg gcggtggctg tggcattcct 660 

gcagggttgt ttttctctct: tngagacgga gcctcacccc tgccacccca gctggaatgt 720 

ggtggtgtaa acagctcacc gcagccttga ccctgaggct caagcgatcc ttctgcctcg 780 

gcctcctgag tagctgggac cccaggcgag agtcaccaca ccctgtccat gttcctgcag 840 

gtcctgacat gcgaggacgc tgtgtcctcc ccgccacact ttcttcttct ttcttgagac 900 

agacccttgc tccatcaccc aggccagagt gtggtsgtgc ni^^a^aggct cactgcagcc 960 

tcgaccctca ggctcaagcg atccccacgc ctcggacccc caaagcgctg ggaccacagg 1020 

cgagagtcac catgctggcc tgaatcttca gggcacttta cggctgaagt gccactcact 1080 

tarccatscc cgttccaaga gtgtaggcgg tcaccctgtc tctgycgctg acctggcctg 1140 

gaccctcggc tgtgagaggg aggggcgggc tgggccggag gaacctraag ccctcgtgat 1200 

gtcacaagcc catccggctg ggcatcccct gctgcgtcct gagccgcaca cgccccaggt 1260 

ggcccccaca gcagaggcga gccactgrag ggtgragggc ttccacggac ggtcttcagg 1320 

ggragaagaa gggcccaggc ccccaggaga ctcaggagac cagagcctgg ggtcaggggc 1380 

tyagcagggg ctyarccagg gctggatgtc cggagccagc cccgmagccc tgkgktcttt 1440 

gttcttcgca ctcccaccgt ccgtgcgaac agctccagcc ccacccgcgc ct.ccccgtgc 1500 

tgggccccat cagggagccc agaagacgtg cgcgcttctg aaattgggtc cctacatgcc 1560 

tttgtcccag cgcacctcgc tccctccatc tactaccgag actcaaatgc ctgtcttccc 1620 

cccagaggtt gacggatata ttcagacgtt acgacacgga ccaggacggc tggattcagg 1680 

tgtcgtacga acagtacctg tccatggtct tcagtatcgt acgaccccgg cctctcgtga 1740 

agagcagcac aacatggaaa gagccaaaac gtcacagtcc; ctacccgtga gggaatggag 1800 

cacaggtgca gttagatgct gttcttccct cagactrtgc cacgtgggga cccagctgta 1860 

catatgtgga taagctigact: aacggttrrg caactgcaac agcagcrgta ccgttccaat 1920 

gcagacatcg gattcggtga ccgtctcatt gtgccatigag gcaaacgcaa cgtctcaggc 1980 

attctgcttg caaaaaaacc taccacgcgc ttctctagat gcctccggyt ctacagcgca 2040 

aatgctttna tcagccaaca ggaactctaa aataacacgg aacctacaca aaaggctttt 2100 

catgtgcctc acctttctaa aaaggagcct atcgcaccca tcggaatatg tgacgcaagc 2160 

aataaaggga atgttagacg tgtaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2220 

aaaaaaa 2227 



<210> 31

<211> 1288 

<212> DNA 

<213> Homo sapiens 



<400> 31 

cccccgggct: gcaggaatcc ggcacgagcc cgaccncccc agcctcatct ctccctcctc 60 

cgctctcgcc ccccgtgctc cagccaacgt ggcccgncac ccgtccacct gccacaccgt 120 

cctgacccca ggccLtcgcc tgtgctatgg cctccgtcgg gaccacnctt gtctcccccc 180 

tgctgtgtct gctaattccc acccgtgcca gtgatccatg gctgcagaac acaccactcc 240 

atccatggaa aacaaccaca atcactgatt actatccccc cctgggcctc ccggggtgga 300 

ctgggctcag ccgggtggct cacctrgggg ccccagcagt catgggcaga cagtggctgg 360 
ggtcactgca aagacticcc tgcatctctg gcagtegacg ctggccgcca cctgagacac 420

ccacccaggg cccctccctg gggcctgggc ccctgcccag cctggttggg aggctccaag 480 

accaacatcc caagaaagac gagacagaag ccagaccacc cttttigggcc cggcttcaga 540 

agtcacccag caccactcct gctgcatcta ttccctaaaa cacaaacacc aaaccccatc 600 

ccttgatggg agggggcctc acggtttgta aacatgctct aaaccccact ccgcccggcc 660 

trggctcaac gtgtctggta acgtgtgggc tgtgaggctc cccgaacgta gaccccagac 720 

tgcaacgctg gccgctacag ggtctggcac acgggcccac gtcaggccca tcgccacagt 780 

gatggttgtt ctgcgaccgt tcctggtggc ctccgcccca caccccaggc tgacgctgtg 840 

ccccttccac cgggaccctc gggtggcttc catgcacttg tgccctaaat cctgccccta 900 

gactaaactt catcccctgt gttctcaccc cgcagcacgg ccgtcaggga acctgaccat 960 

ctgcagcgcg tcccgctgcc aaggtataac gtcagtgccc cccctcagtg gc^cccatgt 1020 

cacagaattg tcccgcagcc ctggcacatg tgcgccatgt -gggagctggg gcaggtccte 1080 

tttcatcctg cggctccgag ggagggggcc gctccttccc cagcctctac cctgacttgg 1140 

ccctcgtcct gcagccactc agagagcacg atggagccgg agcttcagtt ccgaccaaat 1200 

gcgtgtcgcc ggcctctgtg tgcgtgtgtg tgtgacagag ccagaccctg tctttaaaaa 1260 

aaaaaaaaaa aaactcgagg gggggccc 1288 



<210> 32 

<211> 3280 

<212> DNA 

<213> Homo sapiens 



<400> 32 

gcgtccggtg agaggcaagg accttccatc cttagggcat cccgaccacg ccctgcttca 60 

gagagactgt tcccggcgtc ccagtgctac ggggagcagg tLcctcctgg ccctgccccc 120 

aggtctcact gtcttactgg ctctgccagg accagaagcc aagaactctg gagcttcctg 180

tcctccacgc cccaaatatg ccagctgcca caacagcacc caccgcactt gtgaagatgg 240 

ctctcgggcc aggtctggca ggacatactt tcatgactcc tctgagaagc gcgaagatat 300

taatgaatgt gaaaccgggc tggcaaagcg caagtataaa gcatattgta. ggaataaagt 360 

tggaggttac atctgtagct gtttggtaaa atacacttta ttcaactttc cggctggtat 420 

tacagattat gaccatccgg attgttatga gaacaacagt caagggacga cacagtcaaa 480

cgtggatatt tgggtgagtg gggtgaagcc tggatttggg aaacagctgg tacgtataac 540 

tatgccactc ccctacccaa acattaacat gtctccctgt gaLtccragg gcagggtagt 600 

tccatccagg ggtaattttg tcctctgccc caaggccatc ::gtcaacgac cggggacact 660

rctggttgcc ataccttggg ggtgatgcgt gtgactggca tctggcggac agagaccagg 720 

gacacagctc aacatcccac agcgcccagg acagcccccc acaaccaaga agcgcccagt 780 

gccacatgtc catagagaca gagaaataca agcgtaggcg gaaagcgccc cagctggcat: 840 

ccaaagacct acaccaagca cctagatcct caatgccaca cgcaccttgz agcaccccaa 900 

taaataccgc cgccttctgg gccccccacc ccaacacttg cgcacatccc ctacttctta 960 

cattccagca agagccaatc caattagcct aatcttcctg gaaggaaaac ctgagaagaa 1020 

acggaagcag agaggactitt gcaagaaggg ccactcaaci aatrcaaagc grggagtnga 1080 

gcaccLggaa tgcgagLttt gcntccccag gaaagggtca aarccccgaa tcrgatatag 1140 

tccatgaaac caagaggcgc aacgagacaa gggagaacgc tttcc-ggaa gctggaaaca 1200 

acaccatgga catcaactgt gctgatgctt caaaaggaaa cctaagagag agcactgcag 1260

ctgccctatc acctaccaac ctcttgggga caccccgaai: gcacccttcc tcagtaaacg 1320 

aaaagggacg caggaagcaa aaccgaaccc ctacgccgcg agcggcacca ccggtttgaa 1380 

ggaaaaaatt cccctctctg aacctgcgtt cctgactctic cgccataatc agcctggtga 1440 

caagagaaca aaacacacct gcgccnactg ggagggatca gagggaggcc gccggcccac 1500 

ggagggctgc tcccatgtgc acagcaacgg ttctcacacc aaatigcaagc gctzccatcz 1560 

gcccagcctt gccgtcctcg cggctcttgc ccccaaggag gaccccgtgc cgaccgtgat 1620 

cacccaggtg gggccgacca cctctctgcc gcgcczczcc ccggccaccc ccacctcccc 1680 

cctgcgccgg cccatccaga acaccagcac ccccccccat ccagagcccc ccctctgcct 1740 

ctccccggcc cacctcctgt tcctgacggg catcaacaga accgagcccg aggcgctgtg 1800 

ccccaccatt gcagggccgc cgcacttcct ccacctggcc tgctccacct ggacgcccct 1860 

ggaagggctg cacctctzcc ccaccgtcag gaaccccaag gtggccaact acaccagcac 1920 

gggcagattc aagaagaggt tcacgcaccc tgtaggccac gggaccccag ccgtgatcat 1980 

tgctgtgcca gcaacagctg gaccccagaa ttatggaaca ttcacccacc gttggcccaa 2040 

gcccgataaa ggacccacct ggagcctcat ggggccagta gcagccatta cctcgataaa 2100 
cccggtgtcc uacctccaag ctccgtggat cctgagaagc aaactttcct ccctcaataa 2160

agaagtctcc accatccagg acaccagagc cacgacactt aaagccactt ctcagccatt 2220 

taccctgggc tgtccttggg gccccggtct tttnatggtt: gaagaagtag ggaagacgat 2280 

tggatcaatc attgcatact catccaccat caccaacacc cctcagggag tgttgctctt 2340 

tgcggtacac cgtctcctta accgccaggt tcgaatggaa tataaaaagc ggcctagtgg 2400 

gacgcggaaa ggggcagaaa ctgaaagcac cgagatgtct cgccccacca cccaaaccaa 2460 

aacggaagaa gtggggaagt cctcagaaac cttrcataaa ggaggcactg catcaccatc 2520 

cgcagagtca accaagcaac cgcagccaca ggttcatctc gtctccgccg ctnggccaaa 2580 

gatgaaccga cctggcaagc gccatggcaa cgacccggaa gttacccccc cctcccgctt 2640 

gtct.acagcg cccctgcggc cacacacaga ctggacaaat gccaccattc ctagccttcc 2700 

tgtgaaaagt ctaggcccac ccacctattt tggcttccta tgttcacaga aagaacaaga 2760 

cattngggag aattcttaga tccagagrcc agtagtgcgg cacgtgcaac gaagcgccgg 2820 

aaggatgcat tttaaagatg gcgggcggga gaagtggact ttcttcttgc agctactgcc 2880

accttgccag aaactcacta actggcaccc ggactcagct catagttccc tttctggcct 2940 

ctctgc!:gta ttttatgctc ccaaagacct cacaccaaca ctccacattc acataatcca 3000 

acaactttca tacggatcag cactaaagag ggcgctgcac ttrgcaatac aaaaatgcac 3060 

tatcaggtgc tggagaggat gtggagaaar aggaacactt ttacactgtt ggtgggactg 3120 

taaaccagtt caaccattgt ggaagccagt gcggcgattc cccagggatc taggaccagg 3180 

aataccattc gacacagcca tcccactacr gggratacac cccaagggct ataaatcacg 3240 

ctgctataaa gacacacgca cacgcaaaaa aaaaaaaaaa 3280 



<210> 33
<211> 1297
<212> DNA
<213> Homo sapiens 



<400> 33 

ccacgcgtcc ggactgc::tt acggacattg gacgaagccg aagcatttag aacggcgcct 60 

ggcacacagt tggtgcgtga catggttaag ctttgtgtcc ccacccacat ctcatcttga 120 

atgtgacggc ttccccggct ccczcczgoc gccatgtgaa gaaggccgcr gcttcccctt 180 

caccttccac caccatgatt gccatggacg ctccccactc caaagcagcc ctggacagca 240 

ttaacgagcc gcccgagaac atcccgctgg agctgttcac gcacgtgccc gcccgccagc 300 

tgctgctgaa crgccgcctg gtctgcagcc cccggcggga ccrcatcgac ctcatgaccc 360 

tctggaaacg caagtgcccg cgagagggct ccaccaccaa ggactgggac cagcccgcgg 420 

ccgactggaa aacctcctat ctcctacgga gcctgcacag gaaccccctg cgcaacccgc 480 

gtgctgaaga ggatatgctc gcacggcaaa ccgacttcaa cggtggggac cgctggaagg 540 

tggcigagcct ccctggagcc cacgggacag attttcctga ccccaaagcc aagaagcatt: 600 

ccgtcacacc cnacgaaatg tgccccaagc cccagccggc ggaccttg-a gccgagggct 660 

actgggagga gccactagac acattccggc cggacaccgc ggctaaggac cggct-gctg 720 

ccagagccga ccgcggctgc acctaccaac rcaaagtgca gctggccccg gccgactacc 780 

tcgtgccggc cccccccgag cccccaccwg tgaccaccca acagcggaac aacgccacat 840 

ggacagaggt ctcctacacc cccccagact acccccgggg tgcccgccac atcccctccc 900 

agcatggggg cagggacacc cagcaciggg caggctggta cgggccccga gtcaccaaca 960 

gcagcatcgt cgtcagcccc aagatgacca ggaaccaggc cccctccgag gctcagcccg 1020 

ggcagaagca tggacaggag gaggctgccc aaccgcccta ccgagctgrt gcccagattc 1080 

tctgacagcc gtccatcctg tgtctgggtc agccagaggt zccrccaggc aggagctgag 1140 

catggggtgg gcagtgaggc cccr.gtacca gcgactcctg ccccggctca accccaccag 1200 

ctcgcggzaa ctcactgcca cacagctctg acgttttgtc gtaacaaacg ticctcaggcc 1260 

gggcaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaa 1297 



<210> 34 
<211> 2184 
<212> DNA

<213> Homo sapiens 
<400> 34 

ggcacgaggc gccfcgaggac gt-gctgc^gg ccgctgcccc cgctgcgggg gctgcccccc 60
gggacggcgg cggggggccc gggccgaacc tacccgcacc ggacccrcct ggacccggag 120
ggcaagcact ggccgggccg gagccaccgg ggcagccaga ccgcctcccg cctccaggcg 180 
cgcactgcag gctacgtggg cttcggccic tcgcccaccg gggccatggc gtccgccgac 240 
atcgccgtgg gcggggtggc ccacgggcgg ccccacctcc aggattacut cacaaacgca 300 
aatagagagc cgaaaaaaga cgcccagcaa gactaccanc cagaacatgc cacggaaaat 360

agcacacaca caataactga actraccaga gagccgcaca cacgcgacat aaatgacaag 420 
agtacaacgg acagcaccgt gagagcgacc cgggcccacc accacgaaga cgcaggagaa 480 
gctggcccca agtaccacga ctccaaiagg ggcaccaaga gtttgcggtt attgaaccct 540 
gagaaaacta gcgcgctatc tacagccrra ccacactttig acctggtaaa ccaggacgcc 600 
cccatcccaa acaaagatac aacacattgg tgccaaatgc tcaagactcc tgcgctccaa 660 
gaaaagcacc atgtaacaaa ggcrgagcca gtgatacaga gaggccatga gagcctggcg 720 
caccacatcc tgctctacca gcgcagcaac aaccccaacg aqagcgttci: ggagcccggc 780 
cacgagcgcL atcaccccaa cacgcccgat gcattcctca cctgtgaaac tgtgatcttt 840
gcctgggcta ttggcggaga gggctccrcc tatccacctc atgttggatt atccctcggc 900 
actccattag atccgcatta tgcgctccta gaagtccatt atgataatcc cacttatgag 960 

gaaggcttaa tagataattc tggactgagg ttactttaca caatggatac aaggaaatat 1020 

gatgccgggg tgattgaggc tggcctccgg gtgagcccct tccataccat ccctccaggg 1080 

atgcctgagt cccagtccga gggrcaccgc accttggagt gcctggaaga gctcrggaag 1140 

ccgaaaagcc aagtggaatc cacgcgtcrg ctgtccttct ccacgcccac ctggccggca 1200 

gagcaccagg ccgcgtcatt cccgaaaagg gaaggaaacg aaaccacttg cctatgarga 1260 

tgatt^tgac ttcaatttcc aggagtctca gtatctaaag gaagaacaaa caaccctacc 1320 

aggagataac ccaatcactg agtgccgcca caacacgaaa gat-agagccg agacgacctg 1380 

gggaggacca agcaccagga gtgaaacgrg cccctcacac cntctttact acccaagaat 1440 

taatctcacc cgatgtgcaa gtatcccaga caticacggaa caactccagt ccatcggggt 1500 

caaggagatc tacagaccag ccacgacctg gcctctcatc accaaaagtc ccaagcaaca 1560 

taaaaacctt tctctcatgg atgcracgaa taagctcaaa tggaccaaaa aggaaggtct 1620 

ctccttcaac aagctggccc tcagcctgcc agtgaacgcg agatgctcca agacagacaa 1680 

tgctgagcgg ccgaccccaa ggaacgacag cattaccrcc agatacagaa agaccccaca 1740 

aagccagaac cctttggcgt gcggcacgtc c^cttccccc ttcccccgcc acagagattt 1800

ctcccatcca acttgcctgt ctgcctcccg ccaactcagc tgcacgctga gcaccaagag 1860 

cttgtgacca aaactctgtt ggacttgaca atgttttcta tgatctgaac ctigccacticg 1920 

aagtacaggt taaagactgc gtccactccg ggcatgaaga gtgtggagac ctcticttccc 1980

cactttccct ccctcctttt ncctttccac gccacatgag agacaccaat caggttctct 2040

tctctttctt agaaatatct gacgtcacac aracacggtc aacaaaacaa aaccggcctg 2100 

actcaagaca accattttaa aaaatngggc cg-catgcgg gaacaaaaga actccttcct 2160 

ccctaaaaaa aaaaaaaaaa aaaa 2184 



<210> 35 
<211> 949 
<212> DNA

<213> Homo sapiens



<400> 35 

ggcacgagct cccactgacc ctacLcccca gagcgcagta ctagggcaag gtcccgccac 60 

tgcctccczc cacgaatgta zzzctccczc czgccctggg gacacgggga gcggcccgtt 120 

tctttcccca cctagtccca gaaagatggc gcctiggcctit ccgttgctgg accLtcttct 180

tttccttttt tttttgcacc aaagtggcaa ccaggccagr gtccggggac caagccggcc 240 

tcggggcggg gggcccccac ctgcczczcc ctggzzccca cagcgctagc gccccrgaaa 300 

agacaatart ctctctaaag caa-aagggg tgacgggccg gggggagcgt tcgctgctgc 360 

cggcccccag cccccctccc crgccaggcg tgggggagac ccccgctgcg accgaatgta 420 

acccccccac ccctgccgca gccaacgcag gggaaggggg acaczczzcc tgccccctct 480 

ccccagccaa agagactctg gactragggg gcccatgagc c::ggagaggc czzaaccczg 540 

t^aggaacta cagggggagc cctcticccac ccccatcccc ncccgagagt ggccaacgcc 600 

tacaagcccc tgagcccccc tgcccaggga czcagacccc gzzgczgzcc zzccccggcc 660 

ccggtcttcc tgggcccccg ctgcrccccc gccccccctg gggttggggc gggcgcaggg 720 

gccaccgcgt tccctgtctg ccctgcaccc acagtctccc cgccccctcc ccaccctgtg 780 

tgacctccct ctcttttacc cgcccccgca aacactcccc tcccccaata aaacctggtg 840 

tgcgttcccc aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 900

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaa 949



<210> 36 
<211> 3338 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> SITE 

<222> (2861) 

<223> n equals a,t,g, or c



<220> 

<221> SITE 
<222> (3328)

<223> n equals a,t,g, or c 



<220> 

<221> SITE 

<222> (3330) 

<223> n equals a,t,g, or c



<400> 36 

taccgggaca ttctggagtc cgagaagrca acggcgcggt tgctgcggcc gccgcgctcc 60
ccggcccgag gcgatggaga acggagcggt gtacagcccc actacggagg aggacccggg 120
ccccgccaga ggcccccgga gcggcctcgc tgcccactrt tccatgggcc ggctcccatc 180
gccccggcgc gttctcaagg gcttgcagcc gctgccgcct ccgctggccc tcatctgtga 240
agaagttgta tcacaacgta cttuatgtgg aggactttat cttcccgagz tcgtaagctg 300
cagcgccttt cttctgagtc tccttacact gatcgtgtat cgcactccat tuCatgagag 360
agttgatacc acaaaagtaa aatcatcgga tttttacact actttgggaa caggatgtgt 420
gtttttgttg gcatccatca ttttcgtttc cacacatgac aggacctcag ccgagactgc 480
tgcaattgtg cttggatcta tagcaagttc racgrticcta crcgacttta tcactargct 540
gtatgaaaaa cgacaggagt cccagctgag aaaaccrgaa aataccact-a gggccgaagc 600
cctcactgag ccacttaacg cctaaagact ccggggagca gatgttacct aaggcagcga 660
ccctgcactg tggcgcccga gcccrggcag aagcrcttgt aaaacrcgc" aa"tgtrtaa 720
accacttctt ttggagagca aggggaaggt caagaaggca gtimazcaa taccgtgcza 780
gccaccacaa agcaggccag ataagt::aaa aaaaacttcr ccctaaacaa tiaarcgaaac 840
ttatctcaaa cggagactt:!: gg^ggg^g?^ ggagaaaaca a-cgttttca aaccacacag 900
ctcaacggtt gacaaatgat cccgccactc cgrcacaggc cactcc::t:t:a ccaggctcag 960
cttccaaact acgctctata gccgtataaa ca::cgtgact acac-caccc actcagaaac 1020
tgttttattt ttaaaccaat ttgcttagcc gttcgu^ccg acgcttagat ratgttctgt 1080
taatgggaat tcaacatact taagaaacca acacccaaaa cgctggccta ggcttttttc 1140
cttaacatat attaccaggc tccactigtat: rrcactcagc cttaaatgct: ataacatttt: 1200
tggataacgg ctattaaccc ttcgagaccc -cgtacagcc racaaaatgt acgggagacg 1260
ctggtatttt acgtgtataa aagcaacaa:: accagcaacc tcgcgtctat accgcacctc 1320
ggttgtcgat gtcaagtaaa aaaaagaztg "tigcaaca cacaaaaaaa cggaagaaac 1380
tgataccaca cctaaggacc aaagacaaga aagactttut gcccaagaca gcgaaagtaa 1440
ttacaaaaac aagctttgac cacctaccaa gcaccrgaag agatgagttc acaccacgac 1500
ttagaaagtg gctcaacccc cctgtcggca tatgatcatt cccaccaaaa ccaacacagc 1560
tctgtgggtc ctccttagcg c-ttctrrga agccaacccg cctttctcag gacaccagcc 1620
tttggttttt cazctgttcg agacgcctcc tctccg'ccc cctaccagac agaaanggag 1680
tcatgtgctg ctgcctcatic cagcagaggr cggccrctgg ctctgacacu ccctgccagc 1740
cgtcttzagg cggccctgaa cctcgggccc tzrcgattgr gaacaccgrg ragcaggacc 1800
ttgagagccc ttgttcttac acaggcatcg ctccagcttg cccctggcaa aaaaaaaaaa 1860
aaaaaaaagt: aaatacccag ggaaccccgc ccagactaac actgctggcg gcataagaga 1920
atcaagccat tctcaagaga taactccaca accagaattg cctgttggCw agcagccgtc 1980
acagataggc agggcacrtg ggacacgacc tctctgtcca ggtgactcac agactagacc 2040
ttccttatcc ccctcctaga gttttgactr gggact:c"ag cgttaagatg acgagcccgc 2100
gcatcaggtc cLtctgcacc ttggcggaag cctcccaggg taggttcccc atctgaaaca 2160

gtggaaccat gtttccagtg acaaagctta acgaccccac cctcttctcc ttttctcatc 2220 

tgccatctgc gcgtcttaga tgggttccaa ttgcacgaat gcggctaatg cggttctcag 2280 

aaattggtca gtatggcccr acatagcctic tgccctgccc cactgaccca atacctctag 2340 

gatttgtatc agagctcgga taccag^gtt agcggcggtg ccaccaccac ctaactggga 2400 

gataacgaaa ccaaccacgg atgccgtctt cactgggcat: gccatctaag agaggagaaa 2460 

tagctgggtt ttgggtccaa ccatgaataa ggaccgattc agaaaacgag ccracggtag 2520 

gtagactaaa gtcccacatc agac?:g::acc atcgcgactc agacccacct aaaactcaga 2580 

gcacaccatc cgggctacct cagggccacc acccacgtat tgggcctagt caggaccgac 2640 

agatacattc ccagctggcc tgccacacaa aacatactgt catcgagctt aagctccgct 2700 

tgttctgagg ctccacctcc acgtgcttca rtggcgcaaa agtggatctc ctagctggtc 2760 

acttaattct ttctttttca gaaagacagt atgttcactg gtatatttgg ccactctcag 2820

aaccttcctt cacattgttt ttcacgggac ccacgaatgg nttagccttt cttttctatt 2880

gtagaaggaa ataaatagga gcaaaaagac cacrgtagta aataagttca aggggaactt 2940 

gggaccagaa accactgtta cgcacaaaaa aarggcaaat ccaataaact caaatttaaa 3000 

ataattttta aattaacagt tacgataaat tctacatttt atacaaatag attgcttaga 3060 

atggttctca agaattacaa gagaaatgaa ctcacagtac aaaaatttta caactactat 3120 

acttgtgttc tgtctggggg ccgggaaacg tatttrcaca ctgtagccaa tcatttcata 3180 

tttgtcaatL taaatcctac gggccttcrt: tttrccatcc cccccgacgt cagatcttat 3240 

agtctttrta aataaatcca cttaattaaa acgtcaaaaa aaaaaaaaaa aaaaaaaaaa 3300

gggcggccgc tcgcgaatct agaactangn cccacgcg 3338 



<210> 37 
<211> 1563 

<212> DNA

<213> Homo sapiens 



<400> 37 

cggcacgagg agaaaaggac tcagaccttc cagattggct ccagaaggca gctgctgtca 60 

gtgattttcc ctacagacca ctgcagccac ctacctcgcg ccacggcccc gagacagcat 120 

ggcgtcttcc cccacacgga cccgcaggtg gccccctagg agctntctcg ctgaaggagg 180 

tcatccctct cagtcacatc tgagcagcac ccacccaggc crcggccccg agggtcatgc 240

gggggcacct ggccgggttc ccggccccgc ccggcctggc arctgtctgc cttcgggcta 300 

ccttttcggc acagctccct gggcctgcgg ctgccacaag ccggacccca gctcccctgg 360

gc.tgcagtgc tgcccgcagt ggtccagaaa agaggccggg cacagcggcz cctggctcag 420 

cagcaccctc ggcgcaggct ggcccaggag crccgcgcag ggcgccgcca gctgacccag 480 

ccccggccgc acccaacgcc cgagaaccgg gccggctggg gggcctcttic gacggagctc 540 

tactccaagc cccgctgaat crcccgagga aaagcaciqa tgcticccacg gacacaaggg 600 

aggcagaatc cctcgaagta gaacaaaccg caczcczzzc cagccttccr tcctcagctt 660 

ctcctggacc gccagcccct ccacgcgcct cagacactcc caacgccgca ccgcgggggt 720 

tcctcccgcc ccggcccatc tcazcccccz cggcrgcrga cacagccccc ccgtgcgcat 780 

gtcgccctgt ggcagatgcc cagggacctc ccacacccca ggatcacccc gagtacacrg 840 

accacacagc ctgccgcaca aactgcagcg ccaccraagg accaggaacc cgacattgcc 900 

ggaggtgaag ggatccttcg cgacacagct ccccrccagg aggaccaccc accgggcgtc 960 

gggggagctt ccgccccaag cagcaggagg gagczgccaa ggcgcggagc tcacacccaa 1020 

acactcccag aggacggcac cctgcatggg acgccatcca gcccctccga ctgtggaatc 1080 

aagtacatta tcagccggcc cccggccccc cgctgcgacc zccczzcgcz cgaaccgagc 1140 

cttgtgtgca agggcgcatc aagtcgtacg ggcttcgccg crgggtgacc ccaaacgcaa 1200 

ctccaggaag catgggctac caacgactaa ggtitaacgac cagccgggcg cggcgctgac 1260 

gcccgtaacc ccagcacttt gggaggccga ggcgaacgga ccacgaggcc aggagttcaa 1320 

gaccagcc::g gccaacatgg cgaaaccccc accrc^acta aaaacacaaa aattiagccgg 1380 

gcgtggtggc gggcgcctgc agtcccagc:: acccaagagg ctgaggcagg agaatcgcgn 1440 

gaactcggga ggcgcagccL gcagcgagcc gagaccccgc ccgctgcact ccagcctggg 1500 

cggcagagcg agaccccgtc ccaaaaaaaa aaaaaaaaaa aaaaaaaacc cgaggggggg 1560 

ccc 1563 



<210> 38
<211> 1048

<212> DNA 

<213> Homo sapiens 



<400> 38 

cccgggtcga cccacgcgtc cgacaaaaaa caaggtatgt gcgtgccttg ggatgctttt 60 

ctgggctnac cccacctgtg tcctcacaga ttctctgccc tgccagccat gtcctcggtc 120 

cacaggagcc acttcccact ccaacccccc cacaacctcc ccactgttica ctctcttcat 180 

gccctgtgct cttgccccaa atcccctcac ccaattggga aaactggatg acagataagt 240 

agtttcttat: ttacatgcat ctccctggaa accaaagtcg ctttccagct agtactccct 300 

cacgtcggaa gcacgtgtaa aataacacat ggaaactact cccacctggc acacaagaaa 360 

aaaagggcct gcctatattg caaccgactc ctacactctg catgttttag tctgacccac 420

aattttttct tttggatagt tgaagcaaaa aacgtggcta caagatgacc aaagttcggc 480 

ctacaattcc ctgtttggac aacaggaaga attgtgcagt gttttgcaac aggactaatt 540 

ctagattccc actgccctta aagataacna gagggaaagg gtcttctctt tctttttcat 600 

ttatcaaaga tacttacagg ctcctcagaa gaagtgatgg gccctttagg taacgaaata 660 

ggtgttgatg gtgttatggg tgatgatgta actggcggcg gctgcacaaa gtctccatcc 720 

ctaaacatca gaactgagag ttgggtcact gatttaaaag aaaacgtgta ttaatctaca 780 

gtccaccacc ttgactacca ggaacticgcc acaaccaccc catatcctgg cacaggattc 840 

aaaaggcaaa acctggaacc caaccacgcg tagccaaata ctaagcttcc tcccagacca 900 

aacagnacaa taaacactcc ctattiaagga aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 960 

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1020 

aaaaaaaaaa aaaaaaaagg gcggccgc 1048



<210> 39 

<211> 1430 

<212> DNA 

<213> Homo sapiens 



<400> 39 

cggcacgagg agcagctgag tcccctccct gtctttcact cccctggcat cggtggtttt 60 

acctcttcga ttgaaccctg cttccccgac ccccctggga ggccgccttc ttcaggcgcc 120 

tcccttctct ccacgagccc gctctgacag ccgaggaact ggcaagaccc cgctacccag 180 

agggcgaatg ggtatctttc ccggaacaac cctaattttt ctaagggtga agtttigcaac 240 

ggcggccgcg attgtaagcg gacaccagaa aagtaccacc gcaagtca^g agatgtccgg 300 

cctgaatcgg aaacccctcg natacggcgg ccttgcczcz a"cgcggcrg agcttgggac 360 

tttccccg~g gaccttacca aaacacgacc tcaggcccaa ggccaaagca ccgacgcccg 420 

ccccaaagag ataaaataca gagggatgrc ccacgcgczg ttrcgcatcc gcaaagagga 480 

aggtgcaccg gctctctatc caggaacrgc ccctgcgtcg ccaagacaag catcatatgg 540 

caccattaaa attgggattt accaaagctt gaagcgccta ttcgtagaac gttcagaaga 600 

tgaaaccctu ttaattaata cgacctgtgg ggtagtgtca ggagcgacac cttccactat 660 

agccaacccc accgacgctc taaagatccg aatgcaggcc caaggaagc" tgtcccaagg 720 

gagcatgact ggaagcctta tcgacataca ccaacaagaa ggcaccaggg gtctgtggag 780 

ggcaagtacc cttttcctgc cattatccta cactctcagc ccccacaatt tgcagagaat 840 

tctttcttac ataaagacac aaaatcgcga attaeaaccc aaaaactaag gtaagaaact 900 

cctcatctcc cttgaaaggc ccaaaaccca tcat-ggccc tittatctccg cataatgttc 960 

ggggattaca taggtgggga aagtcacrac atcatctgag acggctgttt cgancatatt 1020 

cacagtgaat gtagttgttc agtgtarttt tttgcaagtt ctgtactaac acgatgatgt 1080 

atgtccctgt agtgcccatg ctcaaaagcc gccaccggcc gcgcgctgtg gcccatgccc 1140 

gtagtcccag cactctggga ggccaacgcg ggcggatcac ccgaggccag aagttcaaga 1200 

ccagcccggc caacctggtg aaaccccatic tcaaccagaa acacaaaaat tagccaggca 1260 

^^ggtggtgca cggctgtggt cacagctact caggaggccg aggcgggaga attgttcgag 1320 

ccccggaggc ggaggttgca gcgagccaag atcatgccac tgcaccccag cccgggtgac 1380 

99ggcgagac tctgtctcaa aaaaaaaaaa aaaaaaaaaa aaaacccgag 1430 



<210> 40 
<211> 2103
<212> DNA

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (2101) 

<223> n equals a,t,g, or c
<220> 

<221> SITE 
<222> (2102) 

<223> n equals a,t,g, or c 



<400> 40 

ttcctcgtag cgagcctagt ggcgggcgtc cgcattgaaa cgtgagcgcg acccgacccc 60 
aaagagtggg gagcaaaggg aggacagagc cct"taaaac gaggcgggtg gtgcctgccc 120 
ctttaagggc ggggcgcccg gacgactgca tctgagcccc agactgcccc gagtttccgt 180 
cgcaggctgc gaggaaaggc cccnaggctg ggtctgggcg cttggcggcg gcggcttcct 240 
ccccgctcgt cctccccggg cccagaggca cctcggctcc agtcatgctg agcagagtat 300 
ggaagcacct gactacgaac gctatccgtg cgagaacagc cattccacga gaggacccgc 360 
gagtgtatta tatcaacact cccgtcrgca acactgtaca ccctctgcca catcttcctg 420 
acccgcttca agaagcccgc tgagtccacc acagggtgtc ccgggccggg cccmtgagac 480 
agtggtgatg ttgacgcccc tcactctgcc ggtgccaggc acggtgcggg cggcatcagc 540 
cattgcggac aagaacaagg ccaacagaga gtcacticcac gacccrtggg agtactacct 600 
cccctacctc tactcatgca cctcctrcccc cggggttccg ctgcccctgg ccgctggaag 660 
acctggagga gcagctgtac tgctcagccc ctgaggaggc agccctgacc cgcaggatct 720 
gtaatcctac ttcctgctgg ctgcctttag acatggagcc gctacacaga caggtcctgg 780 
ctctgcagac acagagggtc ctgccgggta tgtggcctcg cagggcttgg gatacccggg 840 
tttccccaag gagagtagcc cctggtccca ggtgcrtigct gacagcctcc catccctgca 900 
cagagaagag gcggaaggct ccagcccgkc aacggaacct gggccacccc ctggctatgc 960 

cgtgcttgct ggtgctgacg ggcccgtctg tgcccattgt ggccacccac accctggagc 1020 

tgctcaccga tgaggccgcc atgccccgag gcacgcaggg taccccctta ggccaggtct 1080

ccttctccaa gcrgggcccc tttggcgccg ccattcaggc tgtacccacc ctttacctaa 1140

tggtgtcccc agttgtgggc ttccacagct ctccacrctt: ccggagcctg cggcccagat 1200 

ggcacgacac tgccatgacg cagataattg ggaactgtgc ccgtczcczg gccctaagct 1260

cagcacctcc tgtcttctct cgaaccctgg ggcccactcg ctrtgacctg ctigggcgact 1320 

ttggacgctt caactggccg ggcaacttct acantgtgtt cctctacaac gcagccttcg 1380 

caggccccac cacactctgc ctggtgaaga ccnccactigc agccgngcgg gcagagccga 1440 

cccgggcctt cgggccggac agaccgccgc zgcccgzczc cggztzcccc caggcatcza 1500 

ggaagaccca gcaccagcga cctccagcng ggggcgggaa ggaaaaaacc ggacaccgcc 1560 

atccgctgcc taggcctgga gggaagccca aggccactrg gaccccagga cctggaatcc 1620 

gagagggcgg gtggcagagg ggagcagagc cacctgcacc accgcataat ctgagccaga 1680 

gtctgggacc aggacctccc gctttcccat acttaaccgc ggcctcagca cggggtaggg 1740 

ccgggcgact gggtctagcc cccgacccca aacccgttra cacatcaacc tgcctcactg 1800 

ctgttctggg ccatccccar agccatgttr acacgattrg acgtgcaaca gggtggggca 1860 

ggggcaggga aaggaciggg ccagggcagg cccgggaqac agaccgnccc ccctgcctct 1920 

ggcccagcag agcctaagca ctg^gccacc ctggaggggc rciggaccac ctgaaagacc 1980 

aaggggatag ggaggaggag gcttcagcca tcagcaacaa agctgacccc agggtctgcc 2040 

ccgttttttt aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2100 

nna 2103 



<210> 41 
<211> 2349 
<212> DNA 

<213> Homo sapiens 
<400> 41 

ccgacccacg cgtccgccea cttctgcgtc tgcacgcacg ggatgcgtct gattgtgtgc 60
ctgcggcgta tctctgtgtc tgtatgtatg gggtgtgtgc gactgcgtgt gtgcccggtg 120

tatccctgca tgtatggggc gcgcctgtgt gcgactgcgt gtgcgcctgg tgtatctctg 180 

tgtctgtatg tatggggcgt gtctgcgtgc gatcgtgtgt ctgtgtccat gtgcgtgtgt 240 

ctttgtgtga ttctttgtgt atacggcaaa cccagaacag aacatcatca cccacctcac 300 

ctggcaaaac aaaaagcttt cagagagatg tgtggccgcc acgatgtctc agctgcaggc 360 

atttttcaaa gttatguctg atgcagctgg cacccttctt cccggggaca aacgcgatct 420 

ttgatcactg agctgagaat gagtttaccc aacaatttta aaatgcctcg gtgcgagagg 480 

atgggctgct agtaaacact tgtggcaaac acgcgttgcc ttttcctcct ccaagcgcac 540 

ggggcaggcc gcctctctgc acgtccatct gcgggccagc ccgtcccgcc acctcccctg 600 

ccccagggct tagtgccctg tgtggaccac ggctgcccca gggacgctgg ggaaggatcc 660 

agcccaacaa acagattcgt cgacgccctc tcaacgaatg tgcgcttcta aaaatgtcrt 720 

taattctgca ctccatttca tacaacaggt gcacatctaa aacatttatt cttaaacagc 780 

tctgtcttct cagcactgca gactatttct cgctcacctg acaggcagac gttgggacca 840 

tgttctttcc ctgccaagaa tccatcagcc ccccaatgcc tgcacctcga gtccagacrc 900 

cttagcgctg cgtcgggacg taattccccg ttagcatccc cagccccttc cccccctgtg 960 

agccagcctg ggttcctttc tcttcccctt cccccccgtt ttccttcttt ctttttcggg 1020 

tttgtaataa gtaacacaca gtcttgcggc taagaaacca aaataataaa gaagatacgg 1080 

cttgaggatt ctcgctctgc actggccctc ccgccaggac ccggtgcgcc ccccaagagg 1140 

gccttgccat gtgcgagctg agacacargt tcacacttcc ctcccttttt gcacaaaagg 1200 

tagcacacgg caaatagcgt ctgcacgcct cccccctcgc tctgtcztcg ggaczactcc 1260 

acatccatga agaagcagca cccttactct cttgcacagt gggcagcggc cacactgtat 1320 

tcactcgtct ccgctgatgg gcactattgt agnggngcaa ctagtaacct agcatacacg 1380 

cccttcagcg cgtgtaccgg tgcatggatc tcacattccc agaggccgag cgaaagaaca 1440 

aaggcttttr. gacagtaccc ctcaggcact cacacacccg ctctgcgagc cacgtgatca 1500 

tgggagcggc tccctgcaga ggtgtccgac gggagaggct gatgtcctLg cctgatggtt 1560 

gctccaacct gagaatggtg agctgctctc ccaacgcaga ccutcatcac ctgcatctca 1620 

tgcgtgagat tgaactgcct ctcattcatg tccccacctc tgtrtgccca tactttagct 1680 

gggttgctaa tcttttaatg ttttctaggc gcctctcata aactagggag tttagctgta 1740 

gtgatgtgag ttgaaagtat ttcttcaaat cgtcattgtc crgaccctcc ntttcttttt 1800 

tcttttgcca cttaaaatat gcagttgaac ctcagcattt gggggcccct ggaccctgag 1860 

tcgtagttga aaaggtcttc ctattccgag gttacacagg aattcaccca ctcctctccc 1920

aatattttat tcccatttaa acttttgacc cacttggact gactttgtrg cacggtgtca 1980 

agcacgggtc caagcttgct cttccgacct ggggacccta acaccgttca tcaaaatctic 2040 

catgttggcc gggcgtggtg gctcaagcct gnaatcccag cacttitggga ggctgaggcc 2100 

ggtggatcac aaggtcagga gatcgagacc accccggcta acacggtgaa accctatctc 2160 

tactaaaaat acaacaaatc agccgggcgt ggtggcaggc acccgcag^c ccagctaccc 2220 

aggaggccga ggcaggagaa tcgc::tgaac ccgggaagcg gaggctacag cgagctgaga 2280 

ttgcgccacc gcactccagc ccgggcgaca gagcaagact ctgtcccaca aaaaaaaaaa 2340 

aaaaagggc 2349 



<210> 42 
<211> 1559 
<212> DNA 

<213> Homo sapiens 



<400> 42 

attcatgcca aaacataggc crtcagtgcc tatcacacat ggctrtcagc cccctctact 60 

gagggatgta ggagtttact cctgaggtcc gagccccttt ncccttac::-: cccctactct 120 

ttcctaagcc ttccttataa aaactatgca cgtcctiattg ctcicctttc cgactccctc 180 

tctttcacta cccccagcag gagtgacntg taacccccat atgtcagaaa ggcaggcctc 240 

ctggttgaag aaaagatcca cccaagcaag tcagcagtict: aacaaaccrc cgagggggat 300 

cccaaatgtg ggaaggacrg titatataaga caaccaaacg acgacacgag acaataaacg 360 

ctataggaat catggaggaa caatcagcca uttaccttct tggctaggga agagacatca 420 

ttagttgtag aagcaattac caacttctac atttcttatc gtggaaacca aaaatatata 480 

tatgaaaata aaatgttata accgacctca gtgtcccata aaccagcttc caacaattac 540 

caaactgtga ccaatccttt acacacatgc acaggcgtcc ccccagcatc tgcggggcat 600 

tggttctagg accacttatg gacaccaaca tccacggatg ctcaagtccc tgacataaaa 660 

^ggtggacta catgcatata acccgtguac gtcccgcatt aaccaaaccc cccctagacc 720
actcacaaca cgtaacacaa tccaaacgcc acgttaacaa ccgccacact gcattaagcg 780
aataaccccg ccaaaaatgc acccctctca gcccagacgc cttctttcct tgrgtgtgga 840
acatcccagc ccccaagccc ccctcaaccc cccggcacat aggaggccga ccgcgcgtgt 900
gtgcgtgcgt gtgcgtgtgt gtgcgngtgc gtgcgcatac agacacacac atctctgaaa 960
tgtaaacatt ctccctttaa aaaaattatt atcacagcta aacaaaccac cagcaattcc 1020
cttatcccca tatacccggt gtccagactt cctagatcgg cccctaacct ctctacagat 1080
tatttgaatc tgactcaatc catgcactgc aatgctcgat aacctaagta ccctctatag 1140
gttcccttca ccacttcttt atcaaactcc ttgcaacttg ttgcactaaa cggacrgcct 1200
ccctagaatt tcctgcagtc tgaaccatgc ggtaactgtt cacatgttcc agcgtcctct 1260
tatttcctgt gagttggcag ttagatctag aagcccgatt aaattcagat ctcctctcct 1320
tagatcacca actttagatc atcaacttgg atcatctgtt tcactctgct ttcgacacgt 1380
tgttttttag aactacctct taaaattttg atttaacttt ataatcatgt aaaacgttta 1440
taaatttcca aattcagatc agcaaaacac aacaaaatct atccagagaa ggcaaaaaaa 1500
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaa 1559



<210> 43 

<211> 1766 

<212> DNA 

<213> Homo sapiens 

<400> 43 

cggcacgagg agcaccgaag tacccaccac acgaagtata ctttgcaccg tggacacaaa 60 

ctagaaaaat tgcaagtagc ggcatattgL aaccggcacg cactacacga gcagagtcaa 120 

cgtgtcccct tgtagaacat cctctgatga cacccactac caccccccct ctgctaagct 180 

tcgttctgtg tctgaagggc ataaagcacg gaaactacat ctcccagacc ccatcaccag 240 

aaggatatgg ttggatttca gcaatgagtg ggctt^gcat aaaatttgga agacgaaaga 300 

gaagaaaaac ctggccgctg caggttggaa cactggcaac aacagatacg gagctcgcaa 360 

gaagctgcta agcttcctca ggaaaateat ctgtctcaac atttccggca tatgggacca 420 

tcaattatcg tttccagtgg ttctggctga aaactggtcc aacticttcta tccgagaact 480 

gcccatttct gtgcttcagg aaactaagac catcactggc agttrttgtt: gagggatcct 540 

tgcgcattca tttttcctta aaacagcccc cctaactctc actcccccag cctccacggt 600

tgtgtaagtc tttaattctt: agagccacat ctctctcacc cagcataccc cagtgcggct 660 

tctgttttcc agacccaacc ctgactgaca cagtccccat gcgcttcaga tggtgggata 720 

gtttggatat ttgzccczgc ccaaattcca cgtcgaacrc caacccccag tgccgtaggt 780 

ggggcccaat gggaggtgtt cggaccatgg gggcagatcc cccacggcn ggtgctgtcc 840 

tcgtgaragt gagtncttgt: aagacc::ggc cactctaaag cgtrtggcac ccgcaccacc 900 

tcactgcgtc ttgccccngc tcccaccatg cgaagzgccc gczccagczz caccttccac 960 

catgattgca aacttcctga ggcccccctia gaagccaagc agacgccacc accagggcrc 1020 

ctgtaaagca cgcagaactg ccagccaatt atacccctru tccccacaaa ttiaaaaacct 1080 

cttttcttca caaaatggaa agaataaagg cantccttca cagcaacgca agaacggccc 1140 

aacacagacg gctctgccat cagrgagaaa accgagacgc tctctccaga cggcaaaaaa 1200 

gctgcaaaaa caaaaggaaa ctatcaacac accgcctatt gcgatcactt accaagttaa 1260 

gcatattatt aagaagacaa gcataagcct aacacaaccc ggcaatgaat aaaacLgaag 1320 

gagagagagc atatgttggc ctzgczczgr gaaacccaaa tgaaacggtc acctgtccta 1380 

gcagcccatg aaaaattctg cattgtrtac lacgtgccag gaccaacctt aaatrcagct 1440 

ataaaaagtt tgacgatcac aaaaaaaagg gaagcaccaa gcaacatagg cacagagagg 1500 

gaagagcgtc aaacagaatt ctcaacc::gc gtanaaggat acccaagcac ccrc::aagga 1560 

aagcagaaag aagcatgaga aagcctacaa cgacacccac gtcaatacaa caagctggat 1620 

atttagagaa gaaactcctg atcaaacact ctccatgaca cgaacacaca canataatac 1680 

gacatgactg tgtccatgga acacaaagaa acccccccga ccaaagagaa ccggaaaaaa 1740 

aaaaaaaaaa acccgagggg gggccc 1766 



<210> 44 

<211> 2572 

<212> DNA 

<213> Homo sapiens
<220>

<221> SITE 
<222> (2527) 

<223> n equals a,t,g, or c 



<400> 44 

aattcggcac gagtccggac cttctcagmt tgctcgatac caggttgtct ttgtagccat 60 

ttcgctccac agccacccgg aacgccggga gcccccgctc atcccgatcc tctccttgta 120 

catgggcgca cttgtgcgcc gcaccaccct gtgcctgggc taccacaaga acattcacga 180 

caccacccct gacagaagtg gcccggagct ggggggagat gcaacaataa gaaagatgcc 240

gagcttctgg tggccttcgg ctctaactct ggccacacag agaatcagtc ggcctattgt 300 

caaccccttt gtttcccggg accttggtgg cagttctgca gccacagagg cagtggcgat 360 

tttgacagcc acataccctg tggtcacatg ccacacggct ggtcgacgga aatccgtgct 420 

gtgtatcctg ctttcgacaa gaacaacccc agcaacaaac tggtgagcac gagcaacaca 480 

gtcacggcag cccacatcaa gaagttcacc cccgtctgca tggctctgtc actcacgctc 540 

cgtttcgtga tgctttggac acccaacgtg tctgagaaaa tcttgataga catcatcgga 600 

gtggactttg cctttgcaga actctgcgtt gtccctttgc ggatcttctc ctccttccca 660 

gttccagtca cagtgagggc gcatcccacc gggtggctga tgacactgaa gaaaaccctc 720 

gtccttgccc ccagccctgt gctgcggacc atcgtcccca tcgccagccc cgtggtccta 780 

ccctacccgg gggtgcacgg cgcgaccccg ggcgcgggct cccccctggc gggctttgtg 840 

ggagaaccca ccatggtcgc caccgccgcg cgccacgtct accggaagca gaaaaagaag 900 

atggagaatg agtcggccac ggagggggaa gaccctgcca cgacagacac gcctccgaca 960

gaggaggtga cagacatcgc ggaaacgaga gaggagaacg aataaggcac gggacgccat 1020

gggcactgca gggacagcca gtcaggacga cacttcggca ccatctcttc cctctcccat 1080 

cgtattttgt tccctttttt ctgttttgtt ctggtaatga aagaggcctt gatttaaagg 1140 

tttcgtgtca actctctagc atactgggca tgctcacact gacgggggga cctagtgaat 1200 

ggtctttact gttgctatgt aaaaacaaac gaaacaactg acttcatacc cctgcctcac 1260 

gaaaacccaa aagacacagc tgcctcacgg ttgacgttgt gtcctcctcc cctggacaat 1320 

ctcctcttgg aaccaaagga ctgcagctgt gccatcgcgc ctcggtcacc ctgcacagca 1380 

ggccacagac tctcctgtcc cccttcatcg ctcttaagaa tcaacaggtt aaaacccggc 1440 

ttcccttgat ttgcttccca gtcacacggc cgtacaaaga gacggagccc cggtggcctc 1500 

ttaaattccc ctcccgccac ggagttcgaa accatctact ccacacatgc aggaggcggg 1560 

tggcacgctg cagcccggag tccccgttca caccgaggaa cggagacctg tgaccacagc 1620 

aggccgacag atggacagaa tctcccgtag aaaggtttgg cttgaaacgc cccgggggca 1680 

gcaaactgac acggttgaat gacagcatcc cactctgcgt tctcctagat ctgagcaagc 1740 

tgtcagttct cacccccacc gtgtacatac atgagctaac tttttcaaat cgtcacaaaa 1800

gcgcatctcc agattccaga ccccgccgca ugactttccc tgaaggcttg cttttccctc 1860 

gcccctcctg aaggtcgcat: cagagcgagc cacacggagc aecctaactt tgcatcctag 1920 

cttttacagc gaactgaagc tccaagzctc atccagcaic ccaacgccag gttgctgtag 1980 

ggtaacctict: gaagcagata cactacctgg tcctgctatc cccagccata actctgcggt 2040 

acaggtaatt gagaatgtac tacggcac-r cccccccaca ccatacgata aagcaagaca 2100 

ttttataacg ataccagagt caccatigtgg tccticcccga aacaacgcac ccgaaatcca 2160 

tgcagtgcag tatatttttc taagtctcgg aaagcaggtc tt-tccttca aaaaaattat 2220 

agacacggtt cactaaattg acctagncag aattcctaga ctgaaagaac ccaaacaaaa 2280 

aaatacttta aagatataaa tatacgctgr acacgttacg taacttactt caggctacaa 2340 

tacacttcct atttccgcat tcccaacaaa acgcctccaa cacaacacgg cgattgcttg 2400 

tgtgctcaac atacctgcag tcgaaacgta tcgtiatcaat: gaacatcgca ccttatcggc 2460 

agcagtttra caaagcccgt catccgcact tgaatgcaag gcrcagtaaa tgacagaact 2520 

atttttncat tatgggcaac tgggggaata aatggggtca ctgggagtag gg 2572 



<210> 45 

<211> 526 

<212> DNA 

<213> Homo sapiens

<220> 

<221> SITE 
<222> (66)
<223> n equals a,t,g, or c

<220> 

<221> SITE 
<222> (106) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (484) 

<223> n equals a,t,g, or c 
<400> 45 

ctctgacagc ctcctcttcg gccaagccct gcccctgcac agccccgagt ggacagccag 60 
aggtcnagac tggagcccag agcccaagat ggagccccag ctgggncccg aggctgccgc 120 
cccccgccct ggctggccgg ccctgctgct gtgggcctca gccctgagct gttctttctc 180 
cctgccagct tcttcccttt cttctctggc gccccaagtc agaaccagct acaattttgg 240 
aaggactttc ctcggtcttg ataaacgcaa tgcctgcatc gggacatcta tttgcaagaa 300 
gttctttaaa gaagaaacaa gacctgacaa ccggctggct tcccaccttg ggactgcccc 360 
ccgattccct tcgstttctc acccttgcaa actaccccar acgaccycca aaacctggsg 420 
sccctgtgga racctttcaa ctggccagca awcwccaaac gaaaccccca aacaggaaat 480 
cttntgcccc ctgcacccac ccccaaagaa cccgcacatt gacgtt 526 



<210> 46 

<211> 1032 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (974) 

<223> n equals a,t,g, or c 



<400> 46 



gtaaaat cgc 
tattgggccc 
ccagagccgc 
gggcgtgcgg 
caccagggtt 
caagtncacg 
ccccaggggc 
ttcttttcgt 
gttgtccagt 
catgtcaccc 
ggagttccga 
tccaaaaaaa 
gcagagacac 
gtggtgacca 
aaggggccac 
ggaccgcgcg 
cgcgcczgtg 
gtcaaggctg 



caaacact ca 
tggccgccgc 
gagcacgcgg 
cgcgcact:gc 
cactgcacag 
gggaggcatc 
agtagggcag 
tgtattcctt 
gtctttgtac 
ctttgaaata 
gaatatgaga 
ttgaaggnga 
aacactccct 
gaactggcag 
atgtggtggc 
aagccagcct 
ggtncagcaa 
ca 



caccgagca t 
gngctctirtg 
agcaacarcg 
gagccccaca 
cacacagacg 

CCttCaZCZg 

caacatcLta 
rgcacagccc 
caaatctctc 
agacctctcg 
acccgrrgga 
ttscagcacg 
ctatrctagg 
tcatgcctta 
gggcaaca ^a 
cccgggcggc 



t: tcttiaaaaa 



caaaacgtgg -gggcggtta cgggtggagc 
cgctgaacgt caLcacgact cgtggacctc 
cggagarctg ctczgzgccz gtaggaaggc 
gcggaacaga tgrgcgcgcc cccacgctgt 
cacaccacag aaaatargcc cttgcaaccc 
gcagcagatc aggaaacgcc cccagccagt 
ggtgcgctirg gcagccccga accttacctc 
ctcagcattt cagcctcrtt: agccggatga 
tcctgcgctg "tggaatgga cgtcacgggc 
ttcagaagct ctggaaaacc tgctagagga 
ccacatagct gtgccr.gccc gatgtacttt 
agctcagtcc cgagtctaat tgactactct 
cccctrggaa gaaaataggc cggtgcrtgg 
cccaaaacag tctggcaata aaataagaaa 
atccccacac tttgagaggc tgaggcaggg 
aaatLcaaaa attagccaga catggtggca 
taagaccgaa ggaccacccg agcccaggaa 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1032 



<210> 47 
<211> 2680 

<212> DNA
<213> Homo sapiens

<400> 47

ggcacgagtc ttacagcaag tactcaacaa actcagactt ccataatcag atttatccca 60 

aaccgcaaaa aaaagccgat ttcattacgt taaaccgaca ggagacctct atgagagatg 120 

aaagtaagcg attttaactt cttaattctt ctaactttcg ctctgtttcc cacacttgag 180

gcttttctga aactcaccaa aagagttttg gcagttgtgg ggaatttacc agaacccccc 240 

attattaaaa caactggcct ctcacattaa aatcatcaga caggttttgc gracacagac 300 

atttatgtta agccactgga tagaaccaaa ccgcgtgcag gaacctgttt tctctaagaa 360 

aggactcctt tgagttctgt aggacatgcg cttiaaaggct tcccgtgtca agctatttta 420 

tgaaataaaa ttttacaaac aatgaagcct catttticaca cgttgtattt gcttaatgca 480 

gacttggccc atacgagttt ttgaaatcaa gggtcttttc cgtaaaacta gcttaggagc 540 

ttaatgagcc actgtgtctc tcacccctgt aggatgagca cagagaccac atttgagaag 600 

caggtgtggg tgttttccgc tggccacgtc ccctctgccc ctcctctggc tcagcagtta 660 

cttttcctca gtcatggggg ctgcattcag ccgcgatgcc agcattaggg ccaagctctc 720 

ccgcctctgt gacgctttac tacacgtact cttcatcagt: agtagctgcc attccctgag 780 

tgt-tccttcc accgngcacc gagaagggct gaggatttcc gaacatcaac ctcctctaac 840 

cctctgtatg ggaattagat ttttatcatc ctgaccgccg gggctttcta gagctcaggc 900 

aacttgcccc cgggcgggat ggagcccgcc tggggggagt gtgaatgagg cctctgttgt 960 

ctgtggcttg gctccctgtg gttggaaacc tcagaagcag cagaccccag grgcaacatc 1020 

tggtgcccgg acaagtcgta aaacctgacc tcggggaggg ccttcactgg cagccagcgc 1080

ccacccgtga gcccggtgag cacagacgut agagccaggg ccggcgggcn cagaggtcgg 1140 

agccggggac cagtgggccc agaggtcgga gccggggacc agtgggccca gaggccggag 1200 

ccatgtcccc gccgtgaagc tctgagggtt ggtggcgaca ccggggcgat cgcggcaccg 1260 

cgattgtggt gtcttggccg cccttggccg gcgctgccac tgccctggct tcatgacgta 1320

ggcaccacat ggccctgcgc cgcctcacct acggctccac ctctggctrt tictctacttg 1380 

cncaccctcg agataaactt ctatttcatt ttcacttcgg tttatatgcg ggtttccttc 1440 

caggtctgac gttaagccta taatattgca atgtgacgtt: tegaagtcaa ggtgtaatag 1500 

agccagtgaa ccaagggttc acaccccagt gaaacacaaa tacccagaat. tgagccactg 1560 

tgttgccata ctgattatgt aatgtgcgat taacaagtat aatgtgccac cttcaacatc 1620 

agtttcatgc caaagttgca ttttatcaga tcatttggga gttcactttg ggcccaaagg 1680 

ctcgtgtcta cataataata acttatgacc ct^ccttttg cctctgttci atitctttgtt 1740 

ttgtgttttt tgctttctag accatgccag agtaatctca gctctctcta gttactggat 1800 

cacacatatc cttcctgaga agagcagcga ctaaaacgga acacctcrtt aagaacagct 1860 

cctctttaac aaaaaaacct aaaagacaaa tgtgagacgg gcrcagacrtr agctctctgg 1920 

gaacctgaaa gacatttacg ccatattatt tattcacgtig rtitgttc -g gcgggcaaga 1980 

tgccatctga ggctccagat gagaaactgg ggr.aaaacgg aaacctt cca ccratttgca 2040 

attatatata ccttgaatta ctacataaaa c::rgattctg tctcrctacc tiattgcaaaa 2100 

attgaaaatg gacattctgt raagtcaaat gtacaqctcg aagcccarai: acttttatga 2160 

agttutgaat caccctgtac cigaaag::cc ccgcrt-aag aatgccttct cggcactaaa 2220 

atgtuccagt tcaagtagtt tgaacacagt tgagttctcc zzctctzczc cactttgcga 2280 

atcatatcag gtacccgtt:: ttcccgtttc garttccttc tictgtgacag aagcagccgt 2340 

cagttctcgg tattaccaag cgttaaaagc atcagtcagg ccgggtgcgg cggcccacgc 2400 

ccgtaatcct ggcacttcgg gagaccgagg caggcggatc acaacgncag gagaccgaga 2460 

gcatcccggc caacacggtg aaaccccacc tctatiLaaaa aracaaaaaa ttiagccgggc 2520 

gtggtggcgg gcgcctgtag tcccagctac tctggaggcc gaggcaagag aatggcatga 2580 

acctgggagg cagagcttgc agngagctga gaccgcacca ctgcactgca gcctgggcga 2640 

cagagcgaga ccccagctca aaataaaaaa aaaaaaaaaa 2680 



<210> 48 

<211> 1730 

<212> DNA 

<213> Homo sapiens 



<400> 48 

cccacgcgtc cgggggcctg cggcggagtc caagaggaac caggccgcct ccgggatggt 60 

gtggagcgct gctcctgccc cccgttgcct cctcggggtc ctggggctgg cccaggtgtt 120 

gggggcccaa gccgtgggcc cctggacggc tccagcgcgc ctgggggcag c^icaggccca 180 

gccctgcagg ccctgcaagg agagctctcc gaggtcaccc ccagccccag ccccctcaac 240 
gactcactga atgagctcca gaccactgng gagggccagg gcgctgatcc ggccgacctg 300 

ggggcaacca aggaccgcat cattcctigag acraacaggc cgcagcagga ggccacagag 360 

catgctacag agagcgaaga gcgctLccga ggcctagagg agggacaagc acaggccggc 420 

cagcgcccca gctcagaggg gcgatcgggc cgccctgagg gngtctgtga acggtcggac 480 

actgtggctg ggggaccgca gggcccgcgc gagggcctLt ccagacacgc ggctgggctc 540 

tgggctgggc tccgggaaac caacaccacc agccagacgc aggcagcccc gctggagaag 600 

ctggccgggg gacaggcggg cctgggcagg cggccgggrg cccttaacag ctccctgcag 660 

cccccggagg accgtccgca ccagctcagc ctgaaggacc ccactgggcc cgcaggagag 720 

gctgggcccc cagggcctcc tgggcugcag ggacccccag gccctgctgg acctccagga 780 

tcaccaggca aggacgggca agagggcccc accgggccac caggccctca aggtgaacag 840 

ggagcggagg gggcaccagc agcccccgcg ccccaagtgg cattttcagc tgctctgagc 900 

ttgccccggc ctgaaccagg cacggtcccc- cccgacagag Vtcctgctcaa cgatggaggc 960 

tattacgatc cagagacagg cgcgttcaca gccgccactg gctggacgcc acttgctgag 1020 

cgcggcgctg actgggcacc ggcacgagaa agcggaggcc gtgccgcggc cgctccaacc 1080 

agggcgtggc cggcgtagac tccgggcggc tacgagcctg agggcctgga gaacaagccg 1140 

gcggccgaga gccagcccag cccgggcacc ccgggcgtcc tcagcctcac cctgccgctg 1200 
caggccgggg acacggtccg cgccgacctg gccatggggc agccggcgca ctcggaggag 1260 

ccgctcacca tcttcagcgg ggccccgccc ratggggacc cagagcttga acacgcgtag 1320 

actggggccc cgcccgacgc gcctacgccg gccgaagaga cagcgggggc ggcgggctcc 1380 

-ggggi^cccg cctgagacgg ggcacctagc cctgggcgag cgccgcaccc gggcccgcag 1440 

cggcaccgcg cccagagcgg cctcccccca cgcccggggc gcgccggctic agggaggccc 1500 

ggggccgccc acgcagaccc ctggcccggc gcgacccccc aagaacccct ccagggccgg 1560 

cctgcggagg agccgatcct cgcacccccc gcccccccca ccggccctcc aggtcgattc 1620 

cctgggcccc aggctccccc gcgcgggcgc cgcccaccgc catiactaaac gatcgaggaa 1680 

taaagacact cggtttttcc aaaaaaaaaa aaaaaaaaaa aaggggccgg 1730 



<210> 49 
<211> 1275 

<212> DNA 

<213> Homo sapiens 



<400> 49 

ggcacgagcc agctcctctt ccaagacgcc ccccctaaac -trgccagtg aaatggagaa 60 

gtactgtttg gggaataata cgcccccaag gccttgccta ctucgataa tgctcctgca 120 

catccttctc tctttggcta cccrcaccca acgccacact gctgcctcrc tccccaagca 180 

ccacccctct gttccaacta atggaicaaa gagccatagc agctcctaag gcccgatcac 240 

cacctgagga ggacctccgc ctaggccarc gccgcaaaca agaaagaccc tccctaaaca 300 

cactggtgca attccggaag cartacaaca -ccacggcag caccaacaac cctggctaag 360 

-Cigggggtga agccaccaag gagtgtgcaa acgccacccg ggaagagaca CLgaagacat: 420 

tcgtccatga cCLcaaaaca ccgccaagga cgaggacctt gcaaaaaaaa aatactcaca 480 

aggctttagz tgaaacggca aacaaczcta agccggacgt ggaggaggac gacattgagg 540 

aggccacaga tgtggtcact: gaggaaccga ctaatgacga g-caccggaa ccagaacaga 600 

aatgcatagc aagaggaaag gaaaccacag gagaaaaaaa gaaaagcccc caaggacatc 660 

cacagcgacg ggcttaggca gaagcztctg cagacctcaa caaacccccc aaaacgtttg 720 

aaaatgctga ctccaacatg aaacgatttt catragcaga gaggaatgct catcgcgcac 780 

catccaccca caagcaaacc tacgacgaac aaaagaaaca aaccaaacaa accaccacga 840 

acacacctct gaaaagagtg acatctcaag aagagcccca ggcaggccct ccaggaggta 900 

tcccagaaga agaaggcacc gticaccacag gacacgacag ccccctgggc gctatcgccc 960 

tcaaaggcct tccagcagga caaggcgtgg agacggaagg rgccgatttt cacgaccctig 1020 

acccuacaca ggtctaggcc aatgggtaar ggtgcgtccg cg-ctragcc tttaaagaaa 1080 

aatgcaaaaa gtaaaaaata aaaataacac. caatacaggc aaaagtccac agaacaagga 1140 

tataaggaga gaaaaaaatt azctzcczcz gcat:"~gcgc iccaagccgt gttaccacaa 1200 

aagaacaaaa aagticaagag aaactgcctt caaagcaaaa cagtcacaac aacczaaaaa 1260 

aaaaaaaaaa aaaaa 1275 



<210> 50 
<211> 1762 
<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> (447) 

<223> n equals a,t,g, or c 

<400> 50 

ggtggcctag agacgctgct gccgcggttg cagttgtcgc gcacgcctct gcccgccagc 60 

ccgctccacc gccgtagcgc ccgag-gccg gggggcgcac ccgagtcggg ccacgaggcc 120 

gggaaccgcg ctacaggccg cgctgccggc cgtgccgccg gtggggctgc gggccgcgac 180 

gggtcgcctg ctgagtgggc agccag::c"g ccggggaggg acacagaggc cttgttataa 240 

agtcatttac ctccatgaca cttctcgaag actgaacctt gaggaagcca aagaagcctg 300 

caggagggat ggaggccagc tagccagcat cgagtctgaa gacgaacaga aactgacaga 360 

aaakttcact gaaaacccct tgccacccga rggcgacttc tggattgggc tcaggaggcg 420 

tgaggagaaa caaagcaaca gcacacnctg ccaggacctt tatgcttgga ctgatggcag 480 

catatcacaa tttaggaact ggtatgcgga cgagccgtcc cgcggcagcg aggtccgcgt 540 

ggtcatgtac caccagccat: cggcacccgc cggcaccgga ggcccctaca tgttccagcg 600 

gaatgatgac cggtgcaaca tgaagaacaa cttcatccgc aaatactctg acgagaaacc 660 

agcagttccc tctagagaag ccgaacg-ga ggaaacagag ctgacaacac ctgtacttcc 720 

agaagaaaca caggaagaag atgccaaaaa aacaccraaa gaaagcagag aagccgcctt 780 

gaaccuggcc tacatcccaa tccccagcdL icccccuctc czcczccztg nggticaccac 840 

agttgtacgt cgggtttgga cctgtagaaa aagaaaacgg gagcagccag accctagcac 900 

aaagaagcaa cacaccatct ggcccrcccc ccaccaggga aacagcccgg acctagaggt 960 

ctacaatgtc ataagaaaac aaagcgaagc tgacttagcc gagacccggc cagacctgaa 1020 

gaatatttca ttccgagtgt gctcgggaga agccaccccc gacgacacgt cttgtgacta 1080 

tgacaacatg gctgtgaacc catcagaaag tgggtttgtg acrcrggcga gcgtggagag 1140 

tggatttgtg accaatgaca cttatgagtc ccccccagac caaatgggga ggagtaagga 1200 

gtctggatgg gtggaaaatg aaacatacgg ctattaggac acacaaaaaa ctgaaactga 1260 

caacaatgga aaagaaatga taagcaaaat cctcttatct tctataagga aaatacacag 1320 

aaggtctacg aacaagctta gatcaggtcc tgcggatgag cacgcggtcc ccacgacctc 1380 

ctgttggacc cccacgtttc ggctgtaccc tctaccccag ccagtcaccc agctcgacct 1440 

tatgagaagg caccttgccc aggtctggca cacagcagag cctcaataaa tgtcacttgg 1500 

ttggttgtat ctaactttta agggacagag cttcacccgg cagcga^aaa gacgggctgt 1560 

ggagcttgga aaaccacctc tgtcccccrt cctccataca gcagcacata ttatcataca 1620 

gacagaaaac ccagaatcct: t!:caaagccc acatacggra ccacaggycg gcctgcgcat 1680 

cggcaattct catatctgcc ctttccaaag aataaaacca aaiaaagagc aggaaaaaaa 1740 

aaaaaaaaaa aaaaaacccg ag 1762 



<210> 51 
<211> 2059 
<212> DNA 

<213> Homo sapiens 



<400> 51 

cggcacgagg tttggtccaa gagcgagcag aaccgtgctig agazgggagc acatctggtc 60 

gtgttcaaca cagaagcaga gcagcggaag gaggcatgca cccccagctg acaccntcgg 120 

tcatcgctgt agttcccatc ztacztczcc gtgtcrgr::t caccgcaagt tgttcggcga 180 

ctcaccacaa cctttcacgc tgcaagagag gcacaggagt gcacaagtna gagcaccatg 240 

caaagctcaa a::gcatcaaa gagaaaccag aactgaaaag c::gctgaagg gagcacctgg 300 

aactgttgtic ctattgaccg gagaaccttc cagtcccaac cgctactctc ctcczzaczg 360 

acaacaagac gtgggctgag agctgaaagg aactgttcag ggacgggggc ccatctgatg 420 

accatcagca cggaagctga gcagaacttr actaiccagt ncccggatag acggccttcc 480 

tatttccttg gacttagaga tgagaacgcc aaaggtcagt ggcgttgggr gggaccagac 540 

gccatrcaac ccacgccaga gcattctggc ataagaatga acccgacaac tytcagggag 600 

aaaactgcgt tgrtctggtc ttacaaccaa gacaaatggg cccggaatga ugctccttgn 660 

aactttgaag caagtaggat ctgtaaaaca cccggaacaa caccgaacca gaaactcaga 720 
aagtggtcct cgcgacggaa agagaaaaga aaaaccaact agaacaaggc agaacgcacg 780 

tgcgccaccg gaacacagaa aacacgccgg cccatacagc gtttctagcc acaacggtct 840 

ttttcacttt gtttgatcca ctcgagacaa catgcgtgca tgcgcgcgcg tgtgcgcgta 900 

gataatgcgg cttttgcatg gtgtctgatg gaaggaataa cctttctttg ctttcttagc 960 

agtatttcaa ggcgtctact ttrcaatcgg tgcgcaccga atgcacgtac ggaagaacag 1020 

cgcgaataat gcaacccctc tgccacttcc tcccctttcc cagaccctta gctcttaaaa 1080 

ttcaaaagat gggatactcc aaccggtagc ggcgcatcat titttaaccca aatactgcaa 1140 

gcactttaaa gatccgaaac cacactttca tcgctcgacg tttcattttc agacttttta 1200 

atgucagtca ttacaattac atcgcatgag gaaaactctc ccagaacaac agtgtggaat 1260 

agttctgaat tatgctgtcc cacagatgga aaaaagtcca aatgcctttia aaaatctact 1320 

tcttactcca cccaacacgt ttccgcaaag caagaagtct ttgtaagaca ccttaaacaa 1380 

agtccttcaa tcctacagca gaggaaacaa aatcccccag aagccaaagg gctcaccttc 1440 

acattgttag tccatgacag acccaggcgt gctccatcag aganaacata cattcccttt 1500 

ggtatcacag gaagttactg gggactaccc gaccncatca cttagctaac gactggacaa 1560 

aatttcctaa ccgcttgaa.g caacattgta ctcgtgtttg cactactaac ttgaatagaa 1620 

aataatcaca tttitcaaccc atttatacaa actgtitaacg cttccctaga gctgtacaac 1680 

tatagtttga actagcaagg aagtcattgc cttgacaacc agaaattatg cttttctggt 1740 

gcatgaaaca ctaactgcaa agggcagtca catccaacrt taacaaaata tggtggtctt 1800 

tctcaaaatt ttcaacctgc caattttrcc cggaaaccta accttcctac cactaccttt 1860 

aaaagcagta cntattccgt aaatragcta ctgatrtcrt gcgtttgaaa tactggtgtt 1920 

tgatcgtgcg gcggcggngg cggtgacgga cgcgcgragc cgcgccaaaa gctgacactt 1980 

ctagctgaaa cggugccaaa taaarttgac cgcgccrgcc azazzcacza aaaaaaaaaa 2040 

aaaaaaacta gaggggggg 2059 



<210> 52 
<211> 3282 

<212> DNA 

<213> Homo sapiens 



<400> 52 

cccacgcgtc cgacttaaaa gagaagctct agccgccaaa gatcgggaaa gggaaaggac 60 

aaaaaagacc cctgggctac acggcgcagg cgcagggttc cctactgctg trcctttatg 120 

ctgggagctg tggctgtaac caaccaggaa acaacgtacg cagcagctac ggctgtcaga 180 

gagttgtgcc tcccaagaca aaggcaagtc ctgtttcttt ccccittttg gggagtgccc 240 

ttggcaggct ctggguttgg acgttactcg g-gaccgagg aaacagagaa aggacccttt 300 

gtggccaacc tggcaaagga tctgggacta gcagaggggg agccggctigc aaggggaacc 360 

^ggg-ggtet ccgatgacaa caaacaacac czgczcctgg acccacacac cgggaacrtg 420 

ctcacaaatg agaaacngga ccgagagaag c-grg-ggcc ccaaagagcc ccgtacgctg 480 

catccccaaa ctctaatgga tgacccctcc cagatrcacc ggcccgagcr gagagccagg 540 

gatataaacg accacgcgcc agtaccccag gacaaagaaa cacccccaaa aatatcagaa 600 

aatacagctg aagggacagc acccagacca gaaagagcac aggacccaga cggaggacct 660 

aacggratcc aaaactacac gatcagcccc aactcr.tccr zccacactaa catcagcggc 720 

ggtgatgaag gcatgataca tccagagcta gtgtcggaca aagcaccgga tcgggaggag 780 

cagggagagc tcagcccaac ccccacagcg ccggacggtg gg-ctccatc caggtccggg 840 

acctctactg cacgcaccgt cgtccrggac gtcaacgaca acgccccaca gcccgcccag 900 

gctccgtacg agacccaggc cccagaaaac agccccancg ggzzccztat tgtcaaggca 960 

t^gggcagaag acgtagacrc tggagrcaac gcggaagcar: cctacccact rcttgatgcc 1020 

tcagaaaaca ctcgaacaac cctccaaacc aarccrctct ccggggaaac ccttctcaga 1080 

gaattgctcg accatgagct agcaaacccc cacaaaacaa acatacaggc aacggacggt 1140 

ggaggccttt ctgcaagacg cagggtttira gcggaagcac tggacaccaa tgacaacccc 1200 

cctgaactga tcgcatcacc azzzzzcaac ticcgticgccg agaacrctcc cgagacgccg 1260 

ctggctgttt tcaagatcaa tgacagagac ::crggagaaa acggaaagac ggctcgctac 1320 

atccaagaga atctgccatc cccaccaaaa cctcctgtgg agaaccccca caccccaatc 1380 

acagaaggcg cgccggacag agagaccaga gccgagcaca acaccactac caccgtcact 1440 

gacttgggga cacccaggct gaaaaccgag cacaacataa ccgcgccgg:: ccccgacgtc 1500 

aataacaacg cccccgcczz cacccaaacc tcccacaccc tgctcgcccg cgagaacaac 1560 

agccccgccc tgcacatcgg cagcgccagc gccacagaca gagactcggg caccaacgcc 1620 

caggtcaccc acccgctgrt gccgccccaa gacccgcacc tgcccctcgc ccccctggcc 1680 

cccatcaacg cggacaacgg ccacctgccc gccctcaggt cgctggacta cgaggccccg 1740 

caggctittcg agttccgcgc gggcgccaca gaccgcggct cccccgcgct gaacagcgag 1800 

gcgctgggtg cgcgtgctgg tgccggacgc caacgacaac tcgccctccg cgctgtaccc 1860 

gctgcagaac ggccccgcgc cctgcaccga gccggtgccc cgggcggccg agccgggcta 1920 

cctggtgacc aaggtggcgg cggtggacgg cgactcgggc cagaacgcct ggctgtcgta 1980 

ccagccgctc aaggccacgg agcccgggct gttcggtgtg tgggcgcaca acggggaggt 2040 

gcgcaccgcc aggccgctga gcgagcgcga cgcagccaag cacaggctgg tggtgcttgt 2100 

caaggacaac ggcgagcctc ctcgctcggc caccgccacg ctgcacgtgc tcctggtgga 2160 

cggcttctcc cagccctacc cgcctccccc ggaggcggcc ccggcccagg cccaggccga 2220 

cttgctcacc gcctacccgg tggcggcgtt ggccccggtg tcttcgctct tcctcctctc 2280 

ggtgctcctg ttcgtggcgg tgcggctgtg caggaggagc agggcggcct cggtgggtcg 2340 

ctgcccggtg cccgagggtc ctcttccagg gcatctggtg gacgcgaggg gcgctgagac 2400 

cctgccccag agctaccagt acgaggcgtg tctgacggga ggccccggga ccagtgagtt 2460 

caagttcttg aaaccagtca tttcggatac tcaggcacag ggccctggga ggaagggcga 2520 

agaaaattcc accttccgaa atagctttgg atttaatact cagtaaagtc cgtttttagt 2580 

ttcatatact tttggtgtgt tacatagcca tgt::tccact agtttacttt taaatcccaa 2640 

atttaagtta ttatgcaact tcaagcatta ttttcaagta gtatacccct gtggttttac 2700 

aatgttccat cattcttttg caccaataac aactgggttt aatttaatga gcattttttt 2760 

ctaaatgata gtgttaaggt cttaattctc cccaactgcc caaggaatca attactatta 2820 

tatctcatta cagaaacctg aggttttgac ccactrcaga gcctgcatct cacgattcta 2880 

accacctctg tctatagcgt acttgcccra tctaagaagg catacctaca tttccaaact 2940 

cattctaaca ttctatatat ncgtgtctga aaaccacgtc acttacttct acatcacgca 3000 

ctcaaaaaga aatacttctc caudaccauy ctcatgacaa aacgaaacaa agcatactgt 3060 

gagcaatact gaacaccaat aataccctta gcteatatac tcactatctc aLccttaagc 3120 

atgccacttc tacttggcca acattctctt atgttaactr ttgctgatgc acaaaacaga 3180 

ctatgcctta taattgaaat aaaattataa tctgcccgaa aacgaacaaa aataaaacat 3240 

ttcgaaactt gcgaaaaaaa aaaaaaaaaa aaaaaaaaaa aa 3282 



<210> 53 
<211> 1860 
<212> DNA 

<213> Homo sapiens 



<400> 53 

aattcggcac gagcagcagg gagaagagga tgatgagcat gccaggcccc cggccgagtc 60 

cccgctcctg gccattgctg acctgczczz ctgcccggac rtcacggtcc agagccaccg 120 

gaggagcact gtggactcgg cagaggacgc ccactccccg gacagctgrg aacacacccg 180 

ggaggctggt gtgggctccg cccactcccc ccagcccaac 'acatccacg atatgaaccg 240 

gatggagccg ctgaaaccgc tgccgacacg crcccccgag gccatgcacc rgcccccagc 300 

tccggaaagt ggcagcacca acccacgggc -cagctctct tgtzccacgg agaacagaca 360 

cgccccgccc ctcttcacct ccccccccaa caccgtgtgt gcctatgacc ctgtgggcta 420 

cgggaccccc tacaaccacc tgcccttctc tgac-accgg gaacccctgg cggaggaggc 480 

tgcccaggtg ctcattgtca ccttggacca cgacagtgcc agcagcgcca gccccactgc 540 

ggacggcacc accactggca ccgccacgga tgacgccgat cctccaggcc ctgagaacct 600 

gtttgcgaac tacctgtccc gcatccatcg tgaggaggac tcccagtcca cccccaaggg 660 

tacagcccgg ctgctgtcca accccczgct ccagacccac cngcctaacc ccaccaagaa 720 

agatccagtc ccaccaggag ctgccagttc tctcctggaa gcrctgcgac cccaacaaga 780 

aattcccctt ccccgtgctg aagagcagcg acgccctaga catcctcgtc cccatcctct 840 

tcttcctcaa cgatgcccgg gccgaccagt gtaagaccag ggtgggggtg ggacgctggg 900 

ggctcccctg gcggcgaggt ggagggccgg ccatccgccc cgacccccag gagctcagcc 960 

tggcactctg acctcctagg agggaggcct ccaaaatggt cactgcctgg gcccctcctt 1020 

ccccctccca cccgagtgac ccctcccacg ccacccgccg cagctcgggc gggcctgatg 1080 

cacatcggtg tcctcatctt gctgcttczg agcggggagc ggaacttcgg ggcgcggccg 1140 

aacaaaccct actcaacccg cgtgcccatg gacatcccag ccttcacagg gacccacgcc 1200 

gacctgctca ttgtggcgtt ccacaagatc accaccagcg ggcaccagcg gttgcagccc 1260 

ctcttcgact gcctgctcac caccgcggtc aacggtaggg gcccggagct tgcactcccc 1320 

gcgctcacca ccaggtggtg tcagccacag gccaacctgc cacctgctcc aggtttcaag 1380 

cccctccagc ccacagaaag aacgcaagcc cgtgcaccct tcatagcagc cacattaaaa 1440 
aaaaagggtg acaacaatcg aacaatacgt ttcatccacc accaccccaa cacgcagcca 1500 

acacgaaaac taccaatgag atatccaata ttccctzczc acacccagtc ctgccgccca 1560 

ggccggagtg cagtggcacc accccagccc actgcaacct: ccgccccctg ggttcaagca 1620 

actctcatgc cccagcctcc cgagnaggtg ggatcacagg cgtgcgccac cacacctggc 1680 

taacttttgt atctttagca gagacggggt tttgccatgt cgcccaggct ggcctcgaac 1740 

tcctgacctc agatgatatg cccaccrcag ccccccaaag tgccgggatt acaggcatga 1800 

gccactgcat ctggcctcct caaaaaaaaa aaaaaaaaaa acccgagggg gggcccggta 1860 



<210> 54 
<211> 770 
<212> DNA 

<213> Homo sapiens 



<400> 54 

aattcggcac gagccgaggg gcgaacccac acatccgtcg tgtggacccg ccctgacatc 60 

ccctcaacgc ttagactcgg catcctcctc attgcctgat gccgtttcat gccctctttg 120 

cgtttcctcg ctccggccct cctgctiggcc atccttcccg cacctcccaa cgcccacgcc 180 

gcgccgggga ttggcggccc gatcggcggc ggcccccagg cctcagccaa ggaagagccc 240 

cagagcaacg cgcaacccag cgccgatgag cgcaaacaac gactgczcag ccaggccgag 300 

gaaacccggc agcgccngac cgarcrcaag gccgaacccg ccggcgcacc caaggagacc 360 

agcgaggccc agcgtacgcc ccccaaactg g-gagcgagg acaacagcga tccgcccgag 420 

cgcctctcca agctctcggt gccggtactg gagcaacgcc cggcggcccg cgtggacgag 480 

ctcgcccLct ggcaacaagc gctcagcgcg gccaacagca tgctcaccag cgcgcagacc 540 

cggcccgagc gcgcccaggc cgatatcagc aagaaccagn cgcgcatcga cgagatcaac 600 

ggcctgctga aaagcggtcg ggagaacaac aagccgccga cggacgaacg tcgcgcgctg 660 

ctcgagagta cttctagagc ggccgcgggc ccaccgacct tccacccggg tggggtacca 720 

ggtaagcgta cccaattcgc cctacagtga gtcgcattac aacccaccgg 770 



<210> 55 
<211> 1093 
<212> DNA 

<213> Homo sapiens 



<400> 55 

cagattcggc acgaggttct cr.tcaagact cattcitcaa aaattacctc tgcggactcg 60 

ggtaccaacg atggcaccca cttttgggaa cctgcagccg tgcccrgaga aciigccaccg 120 

gtcatgtgcc gcaccgctcr ctgcacg-ct acgcccctcg gaccggctcc ccccaggatt 180 

ctcttccgrt tttgtctttt tgatrcgggc tztatuttic" icrgcgtact gcaccatact 240 

gtaaaaggga tttcagcaga gacttragcc ttcggggcaa gaggagaaca ggaacgccgg 300 

gctgtctacc ttaggcggag aacccaccit cagacc-ttg gaccacctcc tcccaactgc 360 

agtgtataga aaaaccaaac tacgacctca gagcagagta -taacgaaaa gcacaaaaaa 420 

aggaactaag ctcagcgagg ggcgggggga ggggggagac tttrcctttg aaaaataatg 480 

actcctagga catttgttt" tcagttcaag tgccctrcag cactgtctcg tcccccaata 540 

taccaaccca ctggcacatt: tttctctcgc tctctccctc cgactitrgct ctgtctcccc 600 

agttaagtgt ctccctccct zgtgcccccc gctggcgacc czczgctzcc ctctctcctt 660 

cccttitggca gctgcaacac acag::gtcaC tttggggaaa taaatctagc aaagcctcgc 720 

cttccacgcc gagcgccccc ttggccctga gagggaaagg cccgtccttg ggacgctccc 780 

tggtctt::tt cccccccaag tcttcctctr tcccatcaca cccctccccg cccaccccgt 840 

tctccgctcc cctcttatca ggaattccca agtgaaccrc atcaacgcgg gagcggaaca 900 

gacgctaaaa gcta^iccagg attttgcttc tgtttgctct aaatctcgtg gzzccztccc 960 

tttcctcccc cccccatgcg taagacgttc tgrgcaacct: ccaccaaatc tggracaaaa 1020 

ccactcgcca gagctgtggt gtcagaaaaa caaaatatac tg::tcctcam aaaaaaaaaa 1080 

aaaaaaaact cga 1093 



<210> 56 
<211> 632 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (29) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (46) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (94) 

<223> n equals a,t,g, or c 

<220> 

<221> SITE 
<222> (162) 

<223> n equals a,t,g, or c 



<400> 56 

cgcaattaat gtgagtcagc ccactcatna ggcaccccag gctt~ncacc ttatgctttc 60 

cggctcgtat gttgtgtgga actgtgacgg atancaactt: cacacaggaa acagctatgc 120 

catgatcacg ccaagctcga aattaaccct cactaaaggg ancaaaagcc ggagctccac 180 

egcggtggcg gccgctccag aaccagtgga tcccccgggc cgcaggaatt cggcacgaga 240 

ctatgtatat atgtttaaca tctgtctttt gaaatgcaga aacagtztaa atgtttcttt 300 

gtctattttt cttttttttt aatgctaccc agggaaacar ttrcatatca tttttaagtg 360 

gcctgcctca atgtatattt atttcttttg aaacaaaaag gttccggaaa ctgtttccct 420 

gtagcttcaa atgaataggt gagcaaaacc tatatgggat: gtaatctttc cgttcagtct 480 

cttaaaaaat actttgtttt ggtacatttg gctgtgctcg tggggaaaac aaaaacgcag 540 

agacccttat atacttacgt taaagcaaca cctcattacc tacataaaac agaaatgcac 600 

aataaaaaaa aaaaaaaaaa gctcgagggg gg 632 



<210> 57 
<211> 2687 
<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> (1614) 

<223> n equals a,t,g, or c 



<400> 57 

gtacaccatg ggcctccacc tccgccccta ccgtgtgggg ctgctcccgg arggcctcct 60 

gttcctctcg ccgctgctaa tgctgctcgc ggacccagcg czcccggccg gacgccaccc 120 

cccagtggtg ctggtccccg gtgacncggg taaccaaccg gaagccaagc cggacaagcc 180 

gacagcggtg cactacccct gccccaagaa gaccgaaagc caccrcacaa cccggctgaa 240 

cctggaaccg ctgctgcctg tgcatcactg accgctggac tgacaacatc aggctggttt 300 

acaacaaaac atccagggcc acccagtctc ctgacggtgc ggatgcacgt gcccctggct 360 

ttgggaagac cttctcactg gagttcccgg accccagcaa aagcagcgtg ggttcccatc 420 

tccacaccat ggtggagagc ctcgcgggct ggggctacac acggggcgag gatgtccgag 480 

gggctcccta tgactggcgc cgagccccaa atgaaaacgg gccctacctc ccggccctcc 540 

gcgagatgat cgaggagacg taccagctgt acgggggccc cgtggtgccg gccgcccaca 600 

gtatgggcaa catgtacacg ctctactttc tgcagcggca gccgcaggcc cggaargaca 660 
agcatatccg ggcccccgtg tcaccgggcg cgccccgggg gggcgcggcc aagaccctgc 
gcgtcctggc cccaggagac aacaaccgga tcccagccat cgggcccctg aagatccggg 
agcagcagcg gccagccgcc tccaccagct ggccgctgcc ctacaactac acatggtcac 
ctgagaaggt gcccgtgcag acacccacaa tcaactacac actgcgggac caccgcaagt 
tcctccagga caccggcttt gaagatggct ggctcacgcg gcaggacaca gaagggctgg 
tggaagccac gacgccaccc ggcgtgcagc cgcactgcct ctiatggcact ggcgccccca 
caccagactc cccccactat gagagctccc ccgaccgtga ccctiaaaacc cgctttggtg 
acggcgacgg cactgcgaac tcgaagagcg ccctgcagcg ccaggcctgg cagagccgcc 
aggagcacca agtgttgccg caggagccgc caggcagcga gcacaccgag acgccggcca 
acgccaccac cccggcccat ctgaaacgcg tgccccttgg gccctgactc ccgtgccaca 
ggactcctgt ggctcggccg cggacctgct gttggcctct ggggctgtca tggcccacgc 
gttttgcaaa gtttgtgacc caccattcaa ggccccgagt cttggactgt gaagcatctg 
ccatggggaa gtgccgtucg ttacccctcc tctgtggcag cgaagaagga agaaacgaga 
gtctagaccc aagggacact ggacggcaag aatgccgctg atggtggaac tgctgtracc 
tcaggactgg ctccacaggg tggactggct gggccctggt cccagtccct gcctggggcc 
atgtgtcccc cctattcccg tgggcctttc atactcgcct actgggccct ggcncsgcag 
ccttcctatg agggatgtca ctgggccgtg gtcctgtacc cagaggtccc agggatcggc 
tcctggcccc tcgggtgacc ctccccacac accagccaca gataggcctg ccactggtca 
tgggtagcta gagccgctgg cttccccgtg gcttagccgg cggccagcct gactggcttc 
ctgggcgagc ctagcagccc ctgcaggcag gggcagtccg ttgcgctctt cgcggtcccc 
aggccctggg acatctcacc ccactcctac ctcccttacc accaggagca ctcaagctcc 
ggattgggca gcagatgtgc ccacagtccc gcagccgtgt nccaggggcc ccgacttccc 
cggacgtgct attggcccca ggaccgaagc zgcctcccct caccccggga ctgtggctcc 
aaggacgaga gcaggggccg gagccacggc cctccgggaa cccacggaga aagggaatcc 
aaggaagcag ccaaggctgc tcgcagcttc cccgagccgc accccttgct aaccccacca 
tcacactgcc accctgccct agggtcccac cagtaccaag tgggccagca cagggctgag 
gatggggctc ctatccaccc tggccagcac ccagcccagt gccgggacra gcccagaaac 
ttgaatggga ccctgagaga gccaggggtc ccctgaggcc cccccagggg ctttctgtcc 
gccccagggt gctccatgga tctccctgtg gcagcaggca tggagagtca gggctgcctc 
catggcagta ggctccaagt gggtgaccgg ccacaggccg agaaaagggt acagcctcca 
ggtggggtcc ccaaagacgc cttcasgcrg gactgagccg ctctcccaca gggttcctgt 
gcagctggat ttcccctgtt gcatacatgc ccggcacctg rccccccttg tccctgagng 
gccccacatg gggccctgag caggctgtac ccggatcctg gcaataaaag ractccggat 
gctgtaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaag ggcggcc 



<210> 58 
<211> 619 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (526) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (619) 

<223> n equals a,t,g, or c 



<400> 58 

ccaccagctt ccacccggar catcgtgaca gcctcccact gcztczczac cazgzggcca 60 

gagctatctc cctaaaatgc accgcacagt tgatcaagtc actctccggc ccaaaacccc 120 

cctcggctcc ctgctgccct caggacaaag tccggacccc ccagcacggc ccgcgagacc 180 

catggtgtcc trgtccctgc zcacccczct ggcctcatca ctcgccttct: tgcattctgg 240 

gtcccagcct cctgtaccca gagatgcagt ggctccccat tgccaccccg attcctccct: 300 

tcttttggtc acagagaaag ggtacccrct ctgtcaaatc ccaactcaga cccgacctcc 360 

tccaaggagc cccggccata ccctcccccc ccgaccccca ccccggcaca ccacacagat 420 
cactccgggc ccacttgccc gcctaatggc catctcccca gtagactgta agctccttga 480 

gggcaaggat cgcgctggaa tttttgtatc aacagtgcct ggcttngtgc ccggcaccta 540 

gaaagcactc aataaatgtt tgcttaatga aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 600 

aaaaaaaaaa aaaaaaaan 619 



<210> 59 

<211> 1378 

<212> DNA 

<213> Homo sapiens 



<400> 59 

tcgacttctg ggacgtagat tcccctttac tgaatccttc atcctcactc gcagtatgat 60 

ttattttttt cacactgtat tctacactgc atatgcttct cttacaacga gggtcacttt 120 

cttcctgccc aggattggaa tcaattacag ctatgcaatg ttttttgttt agcatcttct 180 

caattacagg ctcggcagaa gagttttgtg aacaactctc catttcaccg gctgaagaag 240 

aaatacaacc gtcctctaca gttgaacatc cctgtatgac agcattttca tggtaatgag 300 

gattaggttc atactttggt tcttccaaac cggggaaaca aacaacagcc aaaaaccagt 360 

gtgcagcttc attaaggggt acaaaaataa aacccctctc aaaaatatct acgcgccggg 420 

tccatgcttt tactctccca cgccgctttt gctgcattga cagattagtc gcttcatgat 480 

ttctcctcic cctctgacra aggcgtttat agaaaaaaga accgaacata tgaattcggt 540 

cagcgtcctc cctcttcagc ttttcaagca ccaagtatct caaataaaag tctataataa 600 

caccatmcaa aaaccccccc ccactcagac ag ugcayy C tea t y ta 660 

ctcccctagc tggaggtggc ggatatacca tcaactcttc tactgggcca atgaagatgg 720 

tgtggttttc tccagtttct tcttcttcat caaaaaactg aaattcttgt tcgcttctaa 780 

gttgtatttt agattcaaat gatacagttt caattttgtt ttccttttgc ccacaacttc 840 

ccttgatgct ctcttcatag gttcttgtac aggcaacaag tctgccatta gcttcttcaa 900 

agggaatttc cgcaaaaaaa ttggagatgt tactctttat accaatttca ccaatgacac 960 

cttcaaatac cacatttgcc ggaggatcaa ggccattttg aaaaatcaaa attatatatt 1020 

gttcctctaa atctgttaat ttatttaccc ctttacaatc attccaaact ttatcctcct 1080 

tattcatctg cagtcggatg ctcagctttt gataaactgc tggaattgct tgaagaaaca 1140 

ctacaggtaa ttttcggaca tcacaccatt cacatttagt tagatcagag gtacttaata 1200 

taatctctac aggatcatgg tctggttcgt ctagctgtat cttgataaaa tctaaacaaa 1260 

aaattacagg ctctactaac agccggaaga gcgttcctac ccgtcgacgc ggccgcgaat 1320 

tcccgggtcg acgagctcac cagtcggcgg ccgctccaga ggatccctcg aggggccc 1378 



<210> 60 

<211> 1126 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> (21) 

<223> n equals a,t,g, or c 

<220> 

<221> SITE 
<222> (35) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (49) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (99) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (1012) 

<223> n equals a,t,g, or c 



<400> 60 

tccggcttcg taatgcggcg ngaaactgta acggnataac aatttcacnc aggaaccagc 60 

tatgcccacg attacgccaa gttcgaaatt aaccctcant aaaggaccaa aagcgaagtt 120 

ccaccgcggt gccggcccgc tccagaaacc agtggatccc ccgggctgca ggaattcggc 180 

acgagccggc ccgtaccgcc aggcgatcgc gctgatggcg gcgctggcag cagcggccaa 240 

gaaggtgtgg agcgcgcggc ggctgctggt gctgctgctc acgccgctcg cgctgctgcc 300 

ggtggtcttc gccctcccgc ccaaggaagg ccgccgcttg tttgtcatcc tgctcatggc 360 

ggtgtactgg tgcacggagg ccctgccgct ctcagcgacg gcgctgctgc ccatcgtcct 420 

gttccccttc acgggcacct tgccccccaa caaggtctgc ccccagtact tcctcgacac 480 

caacttcccc ttcctcagtg ggctgatcat ggccagcgcc atcgaggagt ggaacctgca 540 

ccggcgaacc gccctcaaga tcctgacgcc cgtcggagtc cagccggcca ggctcatcct 600 

ggggatgatg gcgaccacct cgttcLCgcc catgcggctg agcaacaccg cctccactgc 660 

catgacgccc cccatcgcca atgccatccc gaaaagtccc tttggccaga aggaggctcg 720 

aaaggacccc agccaggaga gtgaagagaa cacaggaaca gaacccaata ccctcctccc 780 

tgaggaaagg ctgaaacttc aagccccccc rgtgataaga cttggccaga taactgagtc 840 

tggtcaatgg aatatgagtg gaaacgatgc gtgcaacctc cgggttctgt cctccctgcc 900 

gggtggaatg cgaatatgat ggcacctggg acccaaagac aggagccaca tcttgagaga 960 

tagatggcag atctgccccc gtggc::t:tgg atcacttacc ccagcgaaca cnacaagcat 1020 

catccatgaa accataggct ttgtgcgcta gctccagtct ctaaaatatg aactaaatta 1080 

aatacgtatc tgttaaaacc taaaaaaaaa aaaaaaaaaa cccgag 1126 



<210> 61 
<211> 2078 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (337) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (492) 

<223> n equals a,t,g, or c 



<400> 61 

gattattctc aatagactgt caacaaatna tcaaactacg gaaacatcct ttagattcct 60 

gtgacaacgc tttgtgcacg ctctccatca trtagaaaat gaaaacagca gccatctgac 120 

cagccacgtt atgacaccag acaatactcr gcaaggatct cccaaacagg gagaaactgc 180 

aatgggaaca atctttggat atctacactg tgticaaargt catgcgcttt atttcactct 240 

catactcata acagctgtat atcacagctt tcattatcca cattatagag gcaaggcacc 300 

gatatcagga acctgagtaa ctggcccaca cagcaantgc agccgaaata gaagccagga 360 

ctgcccgatt cccaagacag cgttctcgta ccgcaggcaa gcccggtgtt gccgggkttc 420 

cctagcactg agcattgcag aaatacggag atttacttac ccaggcgaat gatgaagcag 480 

aacctttatt tngcttttta aattackgac gggkcc)cgag ccaagccagt ccaccaatga 540 

gtctcaaagg gtgaagaacc ggcctgcgta gttttaaaac aaagcagcag ttttcttcat 600 

ttgtagacat cacatttaat ctaccccgca tttccaatag cattaaatat tagaagcgca 660 

Caagctcctt ctttttatgc ccaggggaca cggctcaaca ctgagaaaca cctgagaaac 720 

actgccacat tctaaggaac aaccaataat aagaaaaact atctcatata aaaggatgca 780 



gctcgcatat aaaatgctga attcaactga agcccaccat gccgccctca cgcgagctca 840 

ggggtgacct ccacgccctg gacccctaaa ggacccagtg tcggatacag ctctgccatt 900 

ggagcaatgt gttccacaac cagggctgcc caagtatatc ccttctcctg kcatctatag 960 

ttttccttta aaggtgcgct tctcaaaaca cgacgttatt tcatatacmc atatatatat 1020 

atgcacacat acatacacmt catgcatigka tcatatatgt gcgtgtatgt atgcatctat 1080 

cttcaatctt ccaggaaagt ttgcaaaawa tagccctaag tagaattatg gggaaacagt 1140 

tgcagaactg ccgaaatcgg cccagacgaa aaggaaagag caaaaaccca ggctattcac 1200 

aggaagttag tccagccgtg ctttgcgagg catgaaccag gctgaatcag agcaagaaca 1260 

ggttgcaatc ttaggtcaca tggcaactaa attccagcta tagaatattg aagacatgta 1320 

agtgcttaat ttaaggatta ttcaacgcag tgaataaact tacatcattc ccaaaagtat 1380 

gcacatcaga caatgagtca actgcacgta acttgaaaac cctaaatatt taactgaaag 1440 

gattttgtgc acagaaagac gtcttagtca atgtaaaaat gccagtgatt tgtgtagcat 1500 

gtcctctgct aaaagccatt atataaactg aagtcactta tctttgtttc caacgtgata 1560 

atattccaca tctaacaggc gataactatt gaaaatatta gtttgcctca ctaatattgt 1620 

gtccgagata ttctagttta aaccaagtaa ctaatactgt aaaagaagta ttccttttat 1680 

gaagccagca ttatcctgat actagaactg ggaagagaca cacacaaaaa gaaaacttca 1740 

ggccaatatc gctgatgaac atcaacgtga aaatcttcaa taaaatactg gcaaaccgaa 1800 

tccagcagca tgaggcgggc ggatcacaag gtcaggagat cgagaccatc ctggctaaca 1860 

tggtgaaacc ccatctctac taaaaacaca aaaaatcagc cgggcacggt gacgggcacc 1920 

tgtagtccca gctaccgggg aggctgaggc aggagaacgg cgcgaacctg ggaggcggaa 1980 

ctcacagtga gccgagactg caccactgca ccccagcczg ggcgacagag caagactctg 2040 

tctcaaaaaa aaaaaaaaaa aaaaaaaaaa aactcgta 2078 



<210> 62 
<211> 762 
<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> (10) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (12) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (42) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (219) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (747) 

<223> n equals a,t,g, or c 
<400> 62 

cgagccagtn ancgagaaag cggaagagcg cccaatcgca anccgcctcc ccccgcgcgt 60 
tgcccgattc attaacgcag ctggcacgac aggtttcccg actgaaaagc ggccagtgag 120 
cgcaacgcaa ttaatgtgag ctagcncact cactaggcac cccaggctcc acaccttacg 180 
cttccggctc gcatgttgtg tggaatcgcg agcggatanc aacttcacac aggaaacagc 240 
catgaccacg accacgccaa gcccgaaatw aaccctcact aaagggaraca aaagctggag 300 

ctccaccgcg gcggcggccg ctccagaact: agcggatccc ccgggccgca ggaattcggc 360 

acgagcacaa atgcccacac aaaactccct tccatgtcct caatactttg taccagatta 420 
tctatatctc ccatggtttg gtctctttct cccgtcttca ttcttctgaa ggcaaaaggg 480

gattttttcc ccccttttct gatctgcaac ctgttttgta ttcggacgat cacaggcgtg 540 

agccaccgcc cccagccaca aacacccttt agccgccata agcacaatca agaaatcatt 600 

ttgcaaatgg caagcttttc acgttgcgta ccctttccca tgatcagaga ggcaaaaagt 660 

gscctgggat gcataaagac gccgtagacm tmtgcaatct gccacattat ctctttaaaa 720 

aaaaaaaara aaaactcgag ggggggnacg gaacccaggg cg 762



<210> 63 
<211> 1094 
<212> DNA 

<213> Homo sapiens 



<400> 63 

tcggcacgag gccaatcaag tgaaacacca cgtaaactgt ccagcagctc cgaaagtaga 60

gaatgaacaa ggcccctccc ccacccaccc cgcggaaagc ccgtctggct cggcgtcctc 120 

ccggacagcg ccctgccggt cacccctggc cacctcccgg cgcgnggttc agacgcgggt 180 

cctgctcccc cgcccccccc ctcccctgcg cccgcccgcc tctgctgtgc cgggccagtg 240 

ccttggtggc cagtggagcg gacaccagcc gcgactgcgc gggaggggct ggcattgccg 300 

ctgccactgc agggcttggg cggccgacac gggacgaggc ttgcacagcc gccagctcct 360 

gtctcgctga ctttttctac acagctttgc ctgggccacc gccttcagcg ccacgggccc 420 

cctgccgttc aggctgctcc tcacagacga acaaggccct gccccgcgtc ctcacccttt 480 

agagctgttt aaatccaaat gaaccgaaac cgaatacgaa aaatccaggc ccrcagccgc 540 

ccaggccatg tttcaagtgc tccatggcca cacgtggcng gtggacagcg cagctctaga 600 

acattccacc accacagagg gttctgctgg acagcggcct: tigggggctgt ctcgagggtc 660 

cgcctgccag tctcctggca ccaaagtcac tctgccattg tcaagttaca gctattttcc 720 

ttttacctcc aagccactat gtgcgtatgc rgctatgtgt ctgtatttcc cgctaaactt 780 

cctgtcacgg agggaagggt gccacaggcc cagcccccgg agggggcttg gatgtctggg 840 

tggggggaag ggtgccacag gcccagcccc cggagggggt ctggatgtct gggtgggggc 900 

cagggcgcca cagttccagc ccccggaggg ggtttggatg cctgggcggg gctttcttca 960 

cgttccatgt atgataacgg tgaccggggg gtctacagag agaggcatac aaatagtgtt 1020

ggagcgttgt tttcgctaac acaaatgcct gaacagccaa aaaaaaaaaa aaaaaactcg 1080

agggggggcc cggt 1094



<210> 64 
<211> 1361 
<212> DNA 

<213> Homo sapiens 
<400> 64 

cccgggtcga cccacgcgtc cgaacttctt cagtgtattc tacgtactr.c tgtattaatt 60
acactgggca gactgttcca taaacgccga acagatccta tttgccgg-c aatgctatcg 120
ttgtguatcc cgctgatttt ccgcgtagct ggtccaccag ttgtcgggag aagagcgttg 180
aagtctacaa ctataattg:: gcacttgccc actacLcctr tcagcccctc cagttccatt 240
ccacataLtt tccagccctw gaccggtgca cactaacgat taccatgtct crtrggtgaa 300
cagaccattt tataacttta tgacgcacct: ctccrgtcac cttcrttgct ctgaagtcta 360
cccgacatta acatagtcac tccagccctic tttggagcaa t:gtcagtgtg atatatcttt 420
tttcgccctt ttacttccaa ccc^acctiac accactatac ratttcaagt gagcttctcg 480
tagacagcac tcacttgcgt ctgccacgtc tccaaatcca ccgcacaagc acaaacctgt 540
ctctattttc cattcttatc cccatttrga gacagagcct caccttgaca cccagcccag 600
aatgcagtgg tgtgatctcg gcccaccgca acctctgcct ccgagggtca agcaatcctc 660
tcaccccagc ctcccaagta gctgggacta caggcatgaa ccaccangcc cggctaattc 720
ttttcttttt tttttctgca gagatgggga tttaccatgc tgctgaggct ggccccgaac 780
ccccgggctc aaacgaccrg cccagcncag ccccccagag aaatatgacc tcaactagca 840
catctagatc acttacttcc aanggaacta t:cgacgcatt acaacccaag tctgccatct 900
tatttatcgc cttccatttt tccctccgtc tttcacttct gtgccttttt cctttgcatg 960

taaacatgtt ttaggactcc gttetgacgc atccagagta ttttttagtg tatcggttta 1020 

tgtagctttt ctagcggatc cctcaagcaw cacactacat atacataacc aagaagcata 1080 

tcagtgtcat cattttacca gttcaagcaa agtacagaaa tccttgtccc ttttacctgt 1140 

cctgcttata atgtaaatgt ctcaaagact tccrtgacac acatccagaa ccataccaca 1200 
gtgttgtctt tgcttcagtc gtgaaacaca atttagaaac tcatgagaat aaaaacccat 1260

tatatttatt tacaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1320 

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa agggcggccg c 1361 



<210> 65 

<211> 947 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> (67) 

<223> n equals a,t,g, or c



<400> 65 

ctttctatag ttaaagccgg cacgcctgca ggcaccggcc cggaaccccc gggtcgaccc 60 

acycgtncyy ctuctyggtu ctaggccccg cccccrtgcc cctttgctgc agaagggcag 120 

ctgaaggctc accctagaaa ccgggcccgg cgggtcccac ccggctcact ccctcccttg 180 

tccttacaca tacaggaaga caagacccga gcggcgctgt ctctgtgtcc gtcgtgtacg 240 

gctctccctg tcttcatttc ttctcactcc gtccctaaac ctccctccct ctcccttccc 300 

cctcagtact tartctacag acctatgcgc gtgcycccat ccccctgtcc ttttctctct 360 

tcagctctcc ctgcctctca cacacaatcr tacacgcccc gaggagccaa gtttgggaca 420 

tttaccctcc aggcatctgt gtcccctctt gaagagaaaa cacacagctt cacacatcca 480 

ggcatagggg gcaagctctt ggggcaccag gaccctggag caccaggtcc ctcctggaat 540 

attagatcca cctggagcac caggtctctc taagtcccac ctggggaact cggtcccacc 600 
tgggccacca gttcccacct agagcaccgc gtcctgccct agagcacaaa gacctgctcc 660

tcccgagact ctctctgact gcagccaggc atagcaccct cgcctgcgtt tgctccctgg 720 

tccacagatt tggtggccgg gcaggcgcct ggacagtgat gaggtctrgc cgccttaacc 780 

gtccccccca gtcacttccc ccacaggccc agcaggacgc agtccrgagg atcagggatt 840 

ctacagctgc actaaaatca acccraccca aaaaaaaaaa aaaaagggcg gccgctccag 900 

aggatccaag cctacgtacg cgcgcacgcg acaticaagcr ccgaaga 947 



<210> 66 

<211> 1376 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> (18) 

<223> n equals a,t,g, or c 



<400> 66 

ggcgggcggc gcggaagngg cggckgcgcg gccggggcag ccacgccgcc attgtctgcg 60 

gcgcgggcgg ccctgcgggt ctacgcggta ggcgccgcgg tgatcctggc gcagccgctg 120 

cggcgctgcc gcgggggctt cctggagcca gtcycccccc cacgacccga ccgcgtcgcc 180 

atagtgacgg gagggacaga cggcartggc cactccacag cgaagcacct ggcgagactt 240 

ggcatgcatg tcatcatagc tggaaacaac gacagcaaag ccaaacaagt tgtaagcaaa 300 

ataaaagaag aaaccctgaa cgacaaagcg gaacrcttac acrgcgaccc ggcttccatg 360 

acttccatcc ggcagtctgt gcagaagccc aagacgaaga agattcctct ccacgtcccg 420 

atcaacaatg ccggggtgac gatggtcccc cagaggaaaa ccagagacgg attcgaagaa 480 

catttcggcc tgaaccacct agggcacttc ccgccgacca accctcccct ggacacgccg 540 
aaagagtctg ggtcccctgg ccacagtgcg agggcggtca ccgcctcccc tgccacccat 600

tacgtcgccg agctgaacat ggacgacccr cagagcagcg ccngctactc accccacgca 660 

gcctacgccc agagcaagct ggcccccgtc ccgttcacct accacctcca gcggcrgccg 720 

gcggctgagg gaagccacgt gaccgccaac gcggtggacc ccggggtggt caacacggac 780

stctacaagc acgtgctctg ggccacccgc ctggcgaaga agcctcccgg ccggccgctt 840 

ttcaagaccc ccgatgaagg agcgtggact cccatctacg cagcagtcac cccagagccg 900 

gaaggagcrg gcggccgtta cctatacaac gagaaagaga ccaagtccct ccacgtcacc 960 

cacaaccaga aaccgcagca gcagctgtgg cctaagagtc gcgagatgac tggggccctt 1020 

gatgtgaccc cgtgatatcc tgtcccagga tagctgctgc cccaagaaac acattgcacc 1080 

tgccaatagc ttgtgggtct gtgaagactg cggtgtttga gtttcccaca cccacctscc 1140 

cacagggctc tgccctctag ttttgagaca gccgcctcaa cctctgcaga acctcaagaa 1200 

gccaaataaa cattctggag gataatcacc ccaagtggtc ttcaaccata aactttgcga 1260 

ttccaaagtg cccagttgtc acaggcgcca caaacaacta cattttccaa cataaaaaaa 1320 

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaagggc ggccgc 1376 



<210> 67 
<211> 2434 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (10) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (12) 

<223> n equals a,t,g, or c
<220> 

<221> SITE 
<222> (27) 

<223> n equals a,t,g, or c 



<220> 

<221> SITE 
<222> (73) 
<223> n equals a,t,g, or c



<220> 

<221> SITE 
<222> (75) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (103)

<223> n equals a,t,g, or c
<220> 

<221> SITE
<222> (130) 

<223> n equals a,t,g, or c 



<400> 67 

ctgggggtan cncaagaacc ctccg-ngga cttagacg::c aagcrcttnc ctrtgggcag 60

cgtgtctccr ttntncgagt agtgcgccgt gcaaaccaaa czngccggct cgccctccat 120

tccctgacan ttgagacgga atgccccgac cactggcgcc ccgacagaga agccatggag 180

tcattgccac ctccuggc::g ccctcctgga acgtgaccct gtcagtagag gcctcctagc 240 

tcctactaag acacttcttt ccctaaccac catacacctg gcacgcccca tccccacctc 300 

ccttcccccc accttaaagg agactacccc tttgccccat acrgtcaacc taattttccc 360 

ccgtactccc tctagcgaat gatgcgctac caagcacatg ccaggctgtg agaggactat 420 

actgagtagt agaaagaagc taatttgaaa taaaaattat ttgtataatt aagaaagcag 480 

attagacgca catggrcaac aggaagttga ccgtatgtct gctagctaga ttcaaaacat 540 

cataaagatg atagcacgtc aacatactiag cccagccacc acgtcagccc tcgttaggtg 600 

ggcagcttwC ctgctttttc ccttcctccg tggtgacaac ggaggaaata tccaacagaa 660 

acacgtctaa cagggaaact gggatcacag cttatatgca cctgacttga aaggagtatt 720 

gaggaaggct ttcatatatg acctatcttt ggactaaaaa gaacattcac gaaaccaagc 780 

cttctaacac tagttataac tgagaagcaa cagtaacccc gtggacagca atcaagctta 840

aaattgtaaa taaatatggg gataattcag ttgttgcaaa aaaagggcag aacccagtag 900 

aataaagtcc ttttctctta caggtactaa atgaggacag agaacctcag gtgtccttat 960 

gctagtgctc gccgagtgca cactaagaaa gcaattccaa atagatgtat acacctagag 1020 

agagtggtat tagagactca gtgtatgtat ttattcacat gagaggaaac tiggaatacaa 1080 

tcccataaat tattggaata taaccccata aatcaccacc tctcacgact ggaaaacatc 1140 

tgccaatgaa gaaatggccC gtaggcattt gtcttaagat ctttggctgt ttaacaaaaa 1200

tgtaacttta acggcttctc atagttgccc tcataaagtg tactgtctaa aatattttcg 1260 

tatcacgtgc ctttgaaatc tgacagctga cttigggzgtt ggatccctgc ccagccattt 1320 

atcagtattia tcattttact cagcagcrgg caggtgcact agacaaacga gacttaggta 1380 

aggaatggaa cctttcctgt ggtt-gactg cacatcacac cagaagactc cagtacccct 1440 

catcccagaa tgaggaaaaa gtattctaca aaga.acrraa tcacctctgt gaaacctatg 1500 

ggatggaaac agtgtggcct taggagccaa acagtctctg catggcgggg aggaccatga 1560

tggaacatgt gaatttctac ccctagaagt cgtgaaatag gccccgcact tttgcagaat 1620 

gtccttcttt aaacctggct tactccacag ctgtagccga taacatgacc cggggcttag 1680 

ctgctctagc cctgggttct tggagacctc acactgcctg gcccccggcc atccacctaa 1740 

ggaccgcctg ctttctggtc acacgtggac cctgatacga craagcggcc acatatgtgg 1800 

ttgtgcaaaa gcttcctgtt taatgcacag tgttaccgat ttacatcctg gttttcagtg 1860 

gcactatgtc taggaggcaa tatcctttca aacagcgctt tggctaagat agatacttgt 1920 

gaatcaaaga tagcacagaa atgaactaag tatatcccat ctggaattat attttgatac 1980 

tatctaaaat ggttccacct gtcaaagggc caacagaact cttggtttta cttttgtaac 2040 

tactgtacag aaaactccaa gagtgtttga gtgcttgcca ccaggtgttt cccttaataa 2100 

gtagggatac gatcactcac aggaactata catgaaaaaa gttttcgaaa tgtattttitg 2160 

tgatgtgcca tgttgagggg aaaccaaata trtatgatct taaaacattc gtacgaaaac 2220 

actgtacaat gtaatatgct caactttctc aacttcttgc taatttttct aagatacatt 2280 

aaaaacgctt catacttttt tttaagtaaa acggacccag caagaaaatc aaaaacacca 2340 

gaacataaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2400 

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaa 2434 



<210> 68 

<211> 1086 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> (10) 

<223> n equals a,t,g, or c
<220> 

<221> SITE 
<222> (77) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (1056) 
<223> n equals a,t,g, or c



<400> 68 

ttgaaacacn ctctcgagca ttaggtccca gctccaccgc ggtggcggcc gctctagaac 60 

tagtggatcc cccgggncgc akkaaci:cgg cacgagcaac ccgtagtgag tcggctgtca 120

ctcagcagct: ctggaaaatt acctgtctgc actgaatrtc ctcaccagta aaatggaaat 180 

gattatagta ctgaccctgt aggatcattc caaaaaccag agaagttcat gcagcttaga 240 

acagtgccag gcacatggca aacgctacgg cacactcaag catcttcctc tgr.ggtgcct 300 

cgccatcacc acgtgatcgc gccztgcLtg tccctgttcc acttctcaga gggagaaaag 360 

tggccaacct caagaatcaa aaccctgacg ctaccccggg aaatgcacag agccagagag 420 

acacaatttg acttagtatg acccacatca ccccctcagg ctgaatagtg gtggcatgca 480 

catctatacc aaaacgtttt acccctttng tagaaggaaa stpatttgtat cccctattcc 540 

atatcttaga tctttacaag agcactcaag tccaacctcc taagaaactg ccaattttgt 600 

tgatcatgac agtctgcaca gattttcgta ctacttagtg ktgggagtgc ctcagggacc 660 

atcaacaaca ggsccctcct cttacccacg agactaccga ggccccggag gttatctgtc 720 

catccatggc gggtgcacrg ctagaggtta Lctggttagt agcccaacta agaccagaac 780 

ccaggaattt tgatttaatc tcaaacgacc tcttttacrt ccgcatccgg gaaggagaag 840 

aggaaagtaa acgggaaatt catgttanct arggaaaagt catcagtttg atgtttacta 900 

aatgatct'gt: ctcaggagat cgtttacaac ctctatcttc ctagacaaca- aztttctgca 960

agagtaaagc cacggtcact aaggcagtca catcaatgca ctcagcaasc cgcgaagaaa 1020 

aacatatatc aacgttctcc aacaaaacac agtgancacc tgaaaaaaaa aaaaaaaaaa 1080 

aaaaaa 1086 



<210> 69 

<211> 1262 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> (568) 

<223> n equals a,t,g, or c
<220> 

<221> SITE 
<222> (639) 

<223> n equals a,t,g, or c
<400> 69 

cctyctatag gkmaagccgg tackcctgca ggcaccggcc cggaaccccc gggccgaccc 60
acgcgtccgc ttacaactcc cccyccgctt czgtgttzct zccacaccaz tacggcgtat 120
ttaccgatct acaaccggti- tcaagccagt cgcctgctta aaactttctc acgaattctc 180
atcacatttt caaacaaaat ccaaagttct tacc.atgacc ttgrggacat ctctctgacc 240
tcatttacta ccgttacctt ccttattcac tccautccac ccacagtggc cztcttgcaa 300
gtcagagaan gcaacaaana tgctcccgcc rtggggcact cgcacttccg titcrtttggc 360
ctggccactt cacctccaca ctctccattt ctcacgactc tccccccata ttcggacccc 420
tgttcaaatg tcacccctca aaaaggcccr caaaaccacc cratccaaaa cagcacccgc 480
cactgccatc cccacaccca ttttcatutt cccatagcac ctatccccac crggcactat 540
gttatatatt rgcttatccg kccaccgncc g^ccccccca ccggaccgta agccccatga 600
ccgcaagtac acagacgaac cacaaaatga atcaacggr.g aawcatccct gmcacatact 660
atttaatgar tctgrncccca ctttiaagtaa aaaaaaaaca ccatctttct cacccctcaa 720
agtgacatag cttaatccrc taaaccacat ttttcccacc cpccgattta antaaatcag 780
tgacacaaaa aaaagagcga tggggacatg cgaaagaaga ccaaaataga tgccaggaaa 840
cactaaaact gctgaagttt agtggcacac ttcttcttca cacacagtat tatttgagtt 900
actaatcgtg ccactgaacc acagaataag caaaatacta ggcaaacaga accacgcrtg 960
ggggctacat cctgtgcaaa acttgtgcta tgcaaaaata acatcaaata cttaatcacc 1020
acagt-ttcgt tacctctttc ttactctagg aaatgatc-g cagctgagtg aancaggaag 1080
tgacagtgac gactgaagaa acatytagct acaaacaaaa atttacacag cacgtataat 1140
ttattttgca ttaacaataa aaactcctaa gactgaggga aacatgtctt aacctttgat 1200
gacaaaagaa actaaatttg atccagaaac ttcaaaaaaa aaaaaaaaaa aagggcggcc 1260 
ac 1262 



<210> 70 

<211> 1642 

<212> DNA 

<213> Homo sapiens 



<400> 70 

ggcgcgctgc ccggarctgc ctgggttgcg ctgccggcca cgtccccgcg ccgggcctca 60

ggctccttcc tactgtccga gggccaccag gccgccgggg gcctgctgcg cccggatgcg 120 

tctgtcacta gagtggagag tctaccrtcg tctcacatgt gccacaaagg atggcatggc 180 

ccgggagtgc cccaccacgt ggctttcacc ccctgcaaag ccagacctcg cccagcgaca 240 

cagtgtcaag cccacagccc tccaaggagg aagacggtcc aggctgggag catcycctta 300 

gcagcagcct ctgatccctt ggccaagcag gagggaacca ttagcagccc gaggagctgg 360 

ctggctggga gcctcgggga ccgcccagcc ttgctcccag ctcacccaca agatgtggac 420 

agctcttgtg ctcatttgga tttcytcctt gtcctcacct gaaagccatg cggcacccaa 480 

cgatccacgc aactccgccc ctaacaaaat gtggaaggga ctagtcaaga ggaatgcatc 540 

tgcggaaaca gttgataata aaacgtctga ggatgtaacc atggcagcag cttctcccgt 600 

cacattgacc aaagggattc ggcagcccani cccaactcca cggaagtcac aacagaggac 660 

acaagcagga cagatgtgag cgaaccagca ac^tcaggag gtgcagccga cggtgcgacc 720 

tccattgctc ccacggctgc ggcctccagt acgactgcgg cccccaccac gaccgcggcc 780 

tccagtatga ctgcggccnc cagtgctccc acgactgcag cctccagtac aactgtggcc 840 

tccattgctc ccacgaccac agcctccagt acgaccgcgg cccccagcac tcccacgaca 900 

cttgcactcc ccgcgcccac gcccactcyc acagggcgga ccccgtccac taccgccacc 960 

gggcatccat ctctcagcac agcccccgca caagtgccaa agagcagcgc gctgccaaga 1020 

acagcaaccc tggccacatt ggccacacgt gctcagactg tagcgaccac agcaaacaca 1080 

agcagcccca tgagcactcg tccaagtccc tccaagcaca cgcccagcga caccgcggca 1140 

agccctgtac cccctatgck tccccaagca caaggtccca ttagccaggc gtcagtggac 1200 

cagcctgtgg ttaacacaac awataaaccc acair.ccatgc cctcaaacac aaccmcwgag 1260 

cccctcaccc aggccgtggc agacaaaact ctccttctgg tggcgccgct actcggggtg 1320 

accctcttca tcacagtctt ggtcttgttt gccctgcagg ccracgagag ccacaagaag 1380 

aaggactaca cccaggtgga ctacctaacc aacgggatgt acgcggactc agaaatgcga 1440 

Qgggggcggg ggcctggcgg gaggcccggc cccctcctcg tccitcccct ctgccrttga 1500 

gaccaaacca agtgcttcca aattc-tttg g^gcaatcga ggagatacgc cagacgcrta 1560 

aacacattta atcgctgcca gattaattcc acgarcacca aagagttgcr gcCuttttca 1620

taaaaaaaaa aaaaaaaaag gg 1642 



<210> 71 

<211> 921 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> (4) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (9) 

<223> n equals a,t,g, or c
<220> 

<221> SITE 
<222> (11) 
<223> n equals a,t,g, or c
<220> 

<221> SITE 
<222> (15) 

<223> n equals a,t,g, or c
<220> 

<221> SITE 
<222> (20) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE
<222> (901) 

<223> n equals a,t,g, or c 



<400> 71 

gcgnggggna naggnaagcn ccccaccatc gggttcaaaa gctggagctc caccgcggtg 60 

gcggccgctc cagaactagc ggaccccccg ggctgcagga awCcggcacg aggcctgagc 120 

agataagatt aagggctggg tccgtgctca atcaaccccc gcgggcacgg gggctgggaa 180 

gagcaaagcc agcggtgcct acagccagca ccacgccggg cctgccgtgg aagggaggtc 240 

tgtcccgggc gccgctgctg ctitccczcag gcccccagac cctigctgatc tacgcccggc 300 

atttccacga gcaaagggac tgtgatgaac acaacgtcac ggcccgttac ctccctgcca 360 

cagtggagtt tgctgcccac acattcaacc aacagagcaa ggaccactat gcccacagac 420 

tggggcacat cttgaattcc tggaaggagc aggtggagcc caagactgta cccccaatgg 480 

agctactgct ggggagaact aggtgtggga aatttgaaga cgacattgac aactgccact 540 

tccaagaaag cacagagctg aacaatactt ccacctgctt ccccaccacc agcaccaggc 600

cctggatgac tcagttcagc ctcctgaaca agacctgctt ggagggactc cactgagtga 660 

aacccactca caggcttgcc catgtgccgc ccccacaccc cgtggacacc agcactactc 720 

tyctgaggac tcttcagtgg ctgagcagct ttggacctgt ttgttatcct attttgcacg 780

tgtttgagat ctcagatcag cgttttagaa aatccacaca ccttgagccc aatcacgtag 840 

cgtagatcat taaacatcag catttcaaga aaaaaaaaaa aaaaaaarct cgaggggggg 900

nccggtaccc agggcggaag a 921 



<210> 72 
<211> 906 
<212> DNA 

<213> Homo sapiens
<220> 

<221> SITE 
<222> (34) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (833) 

<223> n equals a,t,g, or c 



<400> 72 

ggaaactctc cctcactaat cggaacaaaa gccngagctc caccgcggtg gcggccgctc 60 

tagaactagt ggatcccccg ggctgcagga attcggcacg agggaagaga gaggggaggg 120 

tgagcagagg acaggccggg agttttccgg gaacggagga agagcagtgg aggctgccag 180 

gatgaggctg ctgtgtggcc cgtggctgtg gctctcctcg ctgaaagtcc tgcaggccca 240 

gaccccaacc cccctgccac tcccgccccc gatgcagagc ttccaaggaa accagttcca 300 

gggggaatgg ctcgtcctgg gcctggcggg caacagcttc aggccggagc acagggcgct 360 

gctgaacgcc ttcaccgcaa ctttcgagcc aagtgatgat ggccgctttg aggcgtggaa 420 
cgcgatgact cgaggccagc actgtgacac atggtcttac gcgctgarac cggcagccca 480

gcctgggcag ctcactgtgg accacggcgt gggcaggagc cggctgcngc cccccgggac 540 

gctggaccag ttcatccgcc tgggcagagc tcarggcccc ccggacgaca acactgtctc 600 

cccagatgtg actggargtg ccctggacct carcagcctg ccccgggtgg cagccccagc 660 

ctgaccactc agacagccgc ggcccccaag gcctgactcc tcttgtggga gggcgaggct 720 

ggtcacccca ggccagcgtc tgtcgaagga tgaagcagct cctgtccggc ccagccctgc 780 

ctcacagctg tgcgagctct gccctcctca gcccccaaac ctgaacaaat gcnccaagcc 840 

cagaaaaaaa aaaaaaaaaa aaaaaaaaaa cccgaggggg ggcccggtac ccaatccgga 900 

agattg 906 



<210> 73 

<211> 680

<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> (7) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (9) 

<223> n equals a,t,g, or c
<220> 

<221> SITE 
<222> (15)

<223> n equals a,t,g, or c 

<220> 

<221> SITE 
<222> (16) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (22) 

<223> n equals a,t,g, or c



<400> 73 

cactcantng aacannagcc cnagccccac cgcggtggcg gccgcccriag aactagcgga 60 

tcccccgggc tgcaggaatt cggcacgaga catttcgccg gaccccagaa aagccaccac 120 

gacctgtggg ccatgacgct accccaacgg ctgctgccgc tgctccttct cttctccttt 180 

cccttccccc tcaccagggg czcactzzcz ccaacaaaat acaacctttt ggagcccaag 240 

gagtcttgca cccggaacca ggaccgcgag accggctgct gccaacgtgc cccagacaat 300 

tgcgagtcgc actgcgcgga gaaggggtcc gagggcagtc tgngtcaaac gcaggtgntc 360 

Ctcggccagc atagagcgcg tccctgcccg cggaacccga cccgcataca cticaaagaac 420 

gagaaacggc ttagcatcgc ctacggccgc tgticagaaaa ttggaaggca gaagctggcc 480 

aagaaaatgt tcttccagcg ctcccccccc ctcgctgscc cctcctycty cacccgccct 540 

cctccctacc cagagctccg cgktcaccct gctccccaga gcccccacca tgagtggagg 600 

gaagtgggga gcgactgaaa taaagagcct tctcaacgaa aaaaaaaaaa aaaaaaaaaa 660

aaactcgagg gggggcccgg 680 



<210> 74 
<211> 1633 
<212> DNA 
<213> Homo sapiens



<400> 74 

ggcacgagca tcagcaagca ggagctaggg aagagaaagc atgcaaaagg gcaagaacca 60 

ggagaattga catagtagct ggacagaact ctagatgtgt gtgcgtgaga gaaagagagg 120 

cagggagaag gagggaggag ttacccccac gatgacctcc aacttccctt tctgcaccct 180 

catcctgggg atagcacagg cccaggccng ccctggrtgc cctggcgatt ggcctggcct 240 

gggctcaggg gcgggggagg ggctgcacca cattaggacc tgccgtactc caatcccacg 300 

cagtccccct gcccccgctg ctgcgtgcct gggccctggt cacgccaggc ctcctcgtgt 360 

cctgcgtctg tggccggctc ctgccaacct gcccagtcct ttcaggctcg aggccctgca 420 

ctgcccttcc tggzcctczc ccctccttcc cgccccccac ctagcccttt ctgggttccg 480

ggacctgctg acagactttc ttcttgctgc ctgcctgctt acatctcaga agacccctcc 540 

ggaactgccc atggctgtgg tccacctgct ggtagcaacg ccctgttacc aaatgctaga 600 

taatctgcca ctcccctctg cagccgccaa ctggtgctga gctgccacct gctgtctgtt 660 

tccacaccct tgtacactgc attccgcctg ccaccccgga cagctgccca gcagccgcag 720 

ctgaggccgt cctgccagca cctaccgcag ccccatgggt ggtcctgtcc agttcccggc 780 

gctcccagcc cgcctcaccc gctgtactta cagagccaca tccggttgaa tacagagctg 840 

gggtagcctg gacgggccca cggcacaccc actcccagag aggcgggcat cctctccccg 900 

gacctctagg gaagggcagg gtctggagcc caataataaa acacagtaat gataatcaca 960 

gccaatgttt actgagaact taggatgtga taggctcaat gatttaccag aattactcta 1020 

cactgcataa ccaccctacc aggtaagcac tattaacatc ccccccccca catgaagaaa 1080 

ctaaggctaa ccaacagaaa tgaacccgtg taatgcccca cgggcataat tcacggagcc 1140 

aggattcaga accaggcgct ttacttcagg gctgaagccc ataaccacta gaccaaagcg 1200 

cctcctccgt gacctccggg gaaggcccag caacaccc-c tatcgcgtcc aacgacttct 1260 

ccctggtctg agatccaccg actcacaggg gcagggttaa ggttaaggcc agagtctcag 1320 

ctaagctttg gataccctct cctggattcg gagaaggctg gaacaactga atttctgctc 1380 

atcttcagga gcggcctact agagccacac anttcccagc tctccttgtg tgtttcccag 1440 

acccttcttc catcgtgccc tctcccttag gctcctggaa agttttcaga gagaatcacc 1500 

cagtggtaac attgttaaac aaaacaggaa aacgggactt gcgcgtatat acgattaaat 1560

tactaattga tggatctacc ttcttagctc gtgccgaatc cgatatcaag cccatcgaca 1620 

ccgtcgaccc cga 1633 



<210> 75 

<211> 1022 

<212> DNA 

<213> Homo sapiens 



<400> 75 

ttcccgggtc gacccacgcg zccgcccacg cgcccggczc ggggccagca ccctgtctca 60 

aagacggcaa aacgaggcra gccccggacg agccagccgg tgcgggttcc aaccatagga 120 

acacactgat gcccaaaccc taaggtgcca agccccaggc ccrggaggct ggtagaacag 180 

gatctatgcc cggaatcctg gcagggactc crgccaagga crrgcgttta agcctgcttc 240 

agggcttcag gctigctcccg ctctgcgcct gcccaggctg gcrgagcggg tggatgggcg 300 

gacagaaggg ctcaccaagg attgcggaca cagggtaggc cctggcacca cgggtttcag 360 

gctgttatca cttcccttgt aggaaca^ag ccagaagcag atgagccagg gcagagggcc 420 

ggccccncct ctcaccttcc cctcagtcct aaattgtccc cagcgatggg aagaggccag 480 

ggactgcaac ccicgtgcrg cgcatcctct: gagcctccgc tcactctcag ggccaagcag 540 

cccccaagcc ggggccctcc cccggccaaa acccgaggag cagcctaggt tacaggccct 600

tcggtaggta ggctctggct gcctgtcaac gcagctaggc cccccgatta ggtacagtga 660 

gaaacaagct agaacaaccc tggcccagaa gaccgtgcac tccagcaaga cccagggatg 720 

atagccttgc agggccaccg ggagttcgtg cccaagctcc cccctcttct ccccccaggg 780 

ggcactggga cnggcccccg ccctcazcct tagcctgggc cttccccaga ggtattaaag 840 

agaagcatga tccctctgcc cccagccctc ctcaggggca tcctgcccat agcacccagc 900

ccccaagggg cccccagcca cgtggcgaag cccagcactc acgcagctct tagggaacca 960

aaaaccagca ccgaaataaa gctgaacgac tgaccgaaaa aaaaaaaaaa aaagggcggc 1020 

cg 1022
<210> 76

<211> 1184 

<212> DNA 

<213> Homo sapiens 

<400> 76 

agcaaaaggt gaagagagaa aggaagccct ttctccaaaa atggtgcagc tatcctctga 60 

accaatctcc ttcggtttaa tgtacctgca tcctggggtt tttttccact taacttatcc 120 

tggagctctt tccataacaa cacttggaaa gcactctcat cctttttcca ctgctgaaca 180

gaatcccact gtgtggacgg aacacacccc acttcaccag tcccctgtag ccagtcactt 240 

ggtttgtttc caaccttttg ctttctcaga gcaataaccc cgcatgccta tcattttgta 300 

tgcatacagg tttatatgta ggaaaaattc ctagagcagg attgccggac caacggataa 360 

aagtatattg tggacagaca atgccaaatt gccttccaga gactgtggcc ctgcgcaccc 420 

caccaggcat gtgtgactac caaagctcct gccagctgct ctattttatc tcctttccag 480 

tctcaggctc aacgcagaac ttcgaggcaa gctttcctaa aatgtaggct cctaaacgcc 540 

acagccagct ctgccacatg aaggagagct caaacgagac agaaacagcc tctgggcagg 600 

atttccatcc tgcacagata tatctcccac acticcgggaa accgtgaagc tcccagagcc 660 

acaattcccc agaaacacat ccccctgctg gtacagccaa gccccagaac aagctgtgct 720 

tgcccggcac ctcaaagcca agcaccatgg atgccactcg ccatgggtgc ctgcaatttc 780 

aaataacgag aaataagaaa cttcagcctc tcagcccccc cagccaacat ttcaggcgca 840 

tgacagcctc aggtggcaag cagctactc:: gccggacagg gcagaagatg gaacacccca 900 

cccccgggga agatcacgag gccaggagac caagaccacc ctggccaaca cagcgaaacc 960 

ccgtctctac r.gaaaataca aaacactagc cqggcgtggc ggcaggcacc tgtagtccca 1020 

gctactcagg aggctgaggc aggagaatgg cgtgaacccg ggaggtggag gcrgcagtga 1080 

gccaagattg cgccactgca ctccagcccg gcaacagagc aagaccccat accaaaaaaa 1140 

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaa 1184 



<210> 77 

<211> 312 

<212> DNA

<213> Homo sapiens 

<400> 77 

aattcccggg tcgacccacg cgcccgcgat gagtggacrt gtactcctac ccaggtcctg 60 

agggccagcc cacccagcat ccccaccccc gatgacgcrg tccctacaac tggctgaact 120 

ggtgcatttt gcgtgcgccc tccagagcca gcggaczggt gcgtacccaa cgacgccacc 180 

tictgaaacct: acagaaccac tatgccttgc azgzgzaccc -gcagggtcc gagggccagg 240 

ctgtctggca gctctgctcc rgggcgacag agcaagac-c cg"ctcaaaa aaaaaaaaaa 300 

agggcggccg ct 312 



<210> 78 

<211> 1370 

<212> DNA 

<213> Homo sapiens



<400> 78 



tggttaaaga gtacgacatt ttagccaggg cccagcacag cgcccggccg aggagcgccc 60

ttatacgctc titggaagcac atiggggactc cacctccgtc actg^ggcga ccccactitac 120

gcatccacca tgccaatcca aggcagccga tggctcagga aagccagaga ccgagatgtt 180

aaaatcctcg gggctaccca ccaacacgrc zccazzccac crgcragggt caaaggcrct 240

tctaacctgg gccccgacct tagcacagat ccgcccarat cctitittgaag tccagccact 300

tggactatra gccctaaact tcctccgtac rgccactgca gggctgaagg agcctcgcac 360

gcacccacca agtctggctc tcacacccga atcccacacc ccgccctcac ccctagctat 420

tccacctccc tgtggaacac cagcgtcact cagcaacagc catacaaccc cactaccctc 480

atacctacct cccccttcaa agtctcggat gcccgaraca ttgcaccrgc cagcgcattc 540

acccccacca gtacaccccc aagtccttcc agcgaaagac gcaacaactg aatggccact 600

gtgccaaagg cgcctgggcc ccacctgcca ccagtgacgg ggtccccaca gccaacccag 660

tggtgtatca cccattccca aaaccacccc amcgcctgtt zzcztggacc actcccggca 720

ccaactgctc caggccagag cgaccgaaaa cttgggktac aagacactta tcagagacca 780 

atatctacta aaaatgtgaa aggaagcagg atckcgccga ggaagaagac aacaacaaag 840 

ataccacaac gtcagcccat caaagcccrc ggccaaccca gcasaaaaac ccggagcagg 900 

cgcayctctt tcagagtctc cccagccagg ccaaaatgtg cgcccctaca cccgcacccc 960 

ccccaatcac gggcttcagg ccgtcccggg cacgacctca gatgaagcgg ctcacacagc 1020 

tgaggctaac gttgtcggag ctigacagccg aaagccgttt gctgaccaca cccccacagc 1080 

cgagcagcat gcccttgcrt ggaggaggac rcggacgaca cagctccacg tccgccgcac 1140 

Cutgggggtc acattcctca cacccagctc tgcctcaaat ggcaacattg acaaaaattc 1200 

attccagctg caaaccaact gtagaacact acaaaaccag agcacaagtc taatagcata 1260

aatcccatcc taagtggaaa agaaagggcc atctatccac acat-tttgg ggaaaaaaaa 1320 

acaaccttcg ctatgtctct attacaacac ggatacccac aaaaaacagt 1370 



<210> 79 

<211> 368
<212> DNA 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (5) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE
<222> (13) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (35) 

<223> n equals a,t,g, or c

<220> 

<221> SITE 
<222> (351) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (367) 

<223> n equals a,t,g, or c 



<400> 79 

aatgnaactc ccntcactaa ttggaaccaa agccngagc" ccaccgcggt ggcggccgct 60 

ctagaactag cggacccccc gggccgcacg aatccggcac gaggtccagg zgtctctatt 120 

tcagatgtnc tgtttntccc czazztzctg cccacacgaa cacacacacc tgccaggcac 180 

actctggcct ttcrtacctc tcitec-gai zzzgcctccz zcczgccccz gtzzzcztcc 240 
czttzztczg gctatagaaa ccgtcagg-g gcctcgccgg carcacccta ccccctcnga 300

actgtgtrac ccaggaaccc ccacctacra cgccccgagc gggggcccgg nacccaaccc 360 

ggaaganc 368 



<210> 80 

<211> 1088 

<212> DNA 

<213> Homo sapiens 
<220>

<221> SITE 
<222> (1) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (4) 

<223> n equals a,t,g, or c 

<220> 

<221> SITE 
<222> (5) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (9) 

<223> n equals a,t,g, or c
<220> 

<221> SITE 

<222> (11)

<223> n equals a,t,g, or c
<400> 80 

nttnnaacng naaatccccc cttactaccg ggaacaaaag ccggagctcc accgcggtgg 60
cggccgccct agaactagtg gatcccccgg gctgcaggaa tccggcacga gactccccag 120 
aattggttca ccctgggaaa ggaaggccta ggaagctgat gacctacttg ttticgttcac 180 
ccatttctct actgcttctt aaagcgcacc caagcggtca ccaggacacc agaaaagcca 240 
aatccaaggt cccaaggctc ttgatcatcc agtgcccaca acagagggag caagctagac 300 
ggagaagctg aagtgaagga ggaaagagag aagggggtgc ctgtitccacc cccctcctag 360 
cagctgcctc tgcgggctcc acaatcccca tgccczczcc cccaccccac cctccggccg 420 
mccacctctc ctccacccac cccattcatc agcaggaggg caccgtatca gcacagcctc 480 
tcaaggcctg acagttcctt tccgggtgcc ccccacatgc aaccacttaa cccccagcac 540 
aaagggccag cccgaacggg ctrggagccg gcggcccgcg cgraggggga ggcctcaata 600 
ggatttgggg agccgggtgc acaaaccgcz ctgczccacc accccccaac ggaggagctg 660 
gagttggggc acgtgccgag gggaggagct cggcagc-gc atcccgsccc -gaacaatga 720 
tctattcatt ctccnggcac gcacagagaa gacaggctgg ggaaacacgg ccccacaaat 780 
cgcccccagc ctcagccgac ccccacggca arcatrgcag ::ccagagtgc cccctcccac 840 
ttgtaacgct ttgcctcaag cttgccctzc czctzcczcg taaggaagta gaccacacca 900 
cgggaggcct caagcagcct taaatt-cagg caagcaccat cctctgagcc ggttgcaccc 960 
aaactcaacc tcaccctctc aaagcccccc ggccaggt:::! cccccacaca cccatagaaa 1020 
cagacgcaga caaaaaaaaa aaaaaaaaaa actcgagggg gggcccggaa cccaattcgg 1080 
aagatttg 1088 



<210> 81 

<211> 1862 

<212> DNA 

<213> Homo sapiens 



<400> 81 



ggcacgagct ctgcaaagct ccczctgczc cagcacgrcc ngtactttct caactttgga 60

cgtttgcaca agctattatc tnticgcaaga atgcccrgcc cccccctatc catctgtgga 120

agtcaacctt tcctcagggc ccagaggaaa acciatcacc tcccacacct ccccagatgc 180

ccaccatLtg ggttaaaccc tgcczcczcc aggtctgcca tggccttttc cczctcttaa 240

agcactggcc ccagcctacg ccattgtgtg tcacct::ggc Lcccgtctcc cactggccgt 300

agatctttgg gargggagga accggggaar tcccaccctg gggctacggg acagcgctta 360

cccctaagta caaaataagc aagaaccctc gaattgaaaa gccccagcga cttgatagct 420

tccaacagcc cctaaagatg gggtagccca crgctcccca ::ataggaagg aagaggccca 480

cagtctcatg gatttagcag cccccgtagc cccaggatct trcccaaugt ccccagctcc 540

cctgctgggc cagctgtggt ctccgagaat gcccagctcc ctgggctagg ccggcccatg 600

gggccctccc ccctggacct gggcccatgw gaccaagtgg ccagccagcg agagagcccc 660

gggtgatgag catccggaag ccagctcgct tgcctgcccc aatcacagcc c!:gggcccca 720

ccctcaagta ccctactcac acggcccaag agggaaccac tcttctgcca ttccctcccc 780

tcctgctcct catcctaccc acccagccgc tccgggcaag accccgtggg catgatggaa 840

tcctcccttt tctctaccat ccazccczzz gaccacccaa ccccrcarcc gccctccaac 900

caaggtgttc tttccatacc aaaaaatctg atcaggccac cacccctctg cttaaaacca 960

ctcaacggct acccactgcc ctcaggaraa aaacraaaat tcccaacagg acttacaaat 1020

tcttcacctg gcccttttga gacaagacga tccacaatct ttccaaagcc aacaagcatc 1080

ccgagacaat cttaattttt gcagctgtcc aagctattta agactgatga ggnctgcamc 1140

aagttgttaa atggtgatta atttatctgg cccgaatggc akgtgaattc caaggggatc 1200

atttttctct accttccctg ctgcacacrg ggacacagca aggccaggac acccttycta 1260

ggtgactgsg taaccaggct caccaaggca gtcagcgcag agaagaagca gaaaacgggg 1320

aggagttgtg caaatgaaat tgccgacaaz accaaaggaa ccccaagaga ggctcagcct 1380

cccaacaatt gcagcttcgc tgcaacacgg cagggtggga caggggtagg ctcggacctg 1440

ggctgctttg ccmtggaagt ggggttacag ggcagggaga ggcttaccat cactgagcat 1500

ccttaattga ctttgncatt tcastgagaa c-rcacaaatt gagccactct tcattgagtt 1560

atatttggaa gcaggctcaa tacacaatac agzcczcccn ctagaactcc cgcttratgt 1620

ccacgtgacg cagctitgtcc atccccgaa- cccrct-taat gacaaaaaca zttggaagga 1680

atgtaaagac Ccaaaaaggg gatggcacgr aaaccacagt cggaacaata cg::ggccact 1740

taaaagaagt gctgtccagg aggtggaggt cgcagtgagc rgagactgca ccaccgcact 1800

ccagcctggg ccatagaatg agactcctcc wcaaaaaaaa aaaagaaaaa aaaaaacccg 1860

ag 1862



<210> 82 
<211> 1618 
<212> DNA 

<213> Homo sapiens 



<400> 82 

ccacgcgtcc gcaaggagcc agaggccacg cagcggccca gggcccgcga gccgcctggg 60
gaggccacag gacacagggt caccacgggg acagccgccc tgggrcccgc c"gggcagcg 120
cccctgctcc Ctctcctgac gtgcgagatc ccta^ggtgc agcicacctc cgacagagct 180
gtggccagcg actgccaacg gcgctgtgac ncLgaggacc ccczggazcc tgcccacgtia 240
tcctcagccc crtcccccgg ccgcccccac gcccrgcc-g agaccagacc ctiacattaaz 300
accaccatcc tgaagggtga caaaggggac ccaggcccaa cgggcccccc agggcacacg 360
ggcagggagg gcccccaagg ggagcccggc cctcagggca gcaagggtga caagggggag 420
atgggcagcc ccggcgcccc gtgccagaag cgccccctcg cczzczcagz gggccgcaag 480
acggccctgc acagcggcga ggactrccag acgccgctct tcgaaagggc ctctgcgaac 540
cttgatgggc gctttgacat ggcgaccggc cagtctgctg czcccctgcg tggcatccac 600
tccttcagcc tcaatgcgca cagccggaat racaaggaga cgcacgtgca cattatgcac 660
aaccagaaag aggccgncat cccgcacgcg cagcccagcg agcgcagcat catgcagagc 720
cagagtgtga tgctggaccc ggcccacggg gaccgcgtcr gggcgcggcc cctcaagcgc 780
cagcgcgaga acgccatcta cagcaacgac cccgacaccc acatcacctL cagcggccac 840
ctcatcaagg ccgaggacga ccgagggccr ccgggccacc cccccggccg gagagcLcag 900
ctgatacggc atcctgcgag aagacctgcc Qzcczcactg ggatccccct cczgcczccz 960
cccagggctc cgccagggcc ctgctcagtc ccrcccacca aagccatccg aacttccgtt 1020
ccccagggcc tccagccgcc cccagacacc gatgtccgtic cccaggcgct: czczgccccz 1080
catgcccctc ccaccggccc agcgccccga ctccccaggc t::taccaagg cgctaaggcc 1140
cgggtgggca gcccctcgtc tcagagccc" cczccggccz ggcgctgcct ctacaaacac 1200
ctgcaggaga agggccacgg aagccccagg ccctagagcc ctcagcaggt c-ggggagcc 1260
agagcaaagg agggacctca ggcctcccgt zzctzczzcc agggtggggc ggcctggtgt 1320
ccccctagcc ttccaaaccc aggtggcccg cccctciccc cagagggagg cggcctccgc 1380
ccactggtgc tcatgcagac tccggggccg aggcgccccg gggggtgacc tccggtgccc 1440
acagtcgagg gagccgtggc tccacggcca gacgacggaa acagggcctg accaagcgcc 1500
aggaagacct: gtgctataaa ccaccctgcc tgancctgcc cctgcctgac cccgccacgc 1560
cctgccgtcc agcacgacta aagaatgctg tccccccttg gaaaaaaaaa aaaaaaaa 1618 



<210> 83

<211> 2034 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> SITE 
<222> (14) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (382)

<223> n equals a,t,g, or c
<220>

<221> SITE
<222> (1999) 

<223> n equals a,t,g, or c 



<220> 

<221> SITE 
<222> (2027) 

<223> n equals a,t,g, or c 



<400> 83 

actcactaag ggancaaagc tggagcccca ccgcggtggc ggccgctcca gaactagtgg 60 

atcccccggg ctgcaggaat tcggcacgag atccaccggg agakggaycc tgaaagggcc 120 

ccggccagcg tccctgagac gccaacggca gccactgccc cccattccag cccccgggat 180 

acgtactatc agccccgtgc cccggagaaa catgctgaca gcatcctggc actggcttca 240 

gtattccggc ccatctctta ttacnccrct ccctccgcct tcticctactc gcacaggaaa 300

ggttacttga gtccgtccaa agtggtgccg ttttctcact atgccgggac attgctgcta 360 

cttGtggcag gtgtggcccg cncccgaggc attggccgcc ggaccaaccc ccagcaccgg 420 

cagttcacca ccatctcgga agcaacacat cggaaccagc ccicagaaaa caagaggcag 480 

cttgccaact acaactrcga cctccggagc tggccagccg accciccactg ggaagaaccc 540 

agcagccgga aggagtctcg agggggcccn tcccgccggg czgcggcccz gctccgccca 600 

gagcccccgc accgggggac agcagacacc ctcctcaacc gggctaagaa gctgccttgc 660 

cagatcacca gcracccggc ggcgcacacc ctagggcgcc ggatgctgta cccaggctct 720 

gtgtacctgc tgcagaaggc cctcatgcct gtgccgctgc agggccaggc ccgactggtg 780 

gaagagtgca atgggcgccg ggcaaagctg ccggcccgcg acggcaacga gactgacacc 840 

atgtttgcgg accggcgggg gacagccgag ccccagggac agaagctggc gacctgctgt 900 

gaggggaatg ctgggtttta tgaggcgggc cgcgnctcca cgcccctgga agctggatat 960 

tcagtcctgg gccggaacca tccaggctct gccggaagca cgggggtgcc attcccgcag 1020 

aatgaggcta acgccatgga cgi:ggtggn,c cagtttgcca tccaccgcct rggcttccag 1080 

ccccaggaca tcaccatcra cgcccggtcc atcggcggct ccaccgccac gtgggcagcc 1140 

atgtcccacc cagatgctag tgccatgacc ccggangccc cccccgacga cctiggcgccc 1200 

ttggccccga aggtcatgcc agacagctgg aggggcctgg tgaccaggac cgtgaggcag 1260 

catctcaatc taaacaacgc ggagcagctg tgcagacacc agggt:cc::gt accgctgatc 1320 

cggagaacca aggatgagat: catcaccacc acggcccctg aggacatcat gtccaaccga 1380 

ggcaacgacc tcctgctgaa gctcccgcag catcggtatc cccgggcgat ggcagaggag 1440 

ggtcttcgag tggcgaggca gcggctggag gccccctcac agccggagga agcctcaact 1500 

tatagccgat gggaggtgga agaggaccgg tgtctgtctg ccccccgctc ctaccaggca 1560 

gaacacgggc ccgacttccc ctggagcgcg ggggaggaca cgagtgcaga tggacggcgg 1620

cagctggctt tgtttctggc ccggaagcat ctgcacaact ttgaggccac tcactgcacc 1680 

ccactcccag cccagaacct: ccagacgccc tggcacccct agggaccaac tgggactcac 1740 
catggaagaa tggggtgaga ggagacacga ggaaagaccc tcutatctgt gatcctccgc 1800

gttcacgctg ccgtttatag tttgcggaaa gcgggggacc acccccccuc ccaccactgc 1860 

tcctcttgca cgcttcccct catucacgtg gctgtaccta accztczcca acauacatcc 1920 

ngcattacat gaatggacta tccccaataa ttaacaaaaa ggcacctttt ctacaaaaaa 1980 

aaaaaaaaaa aaactccana atcacttcct aaaaccggcc cgcgggnccc atcc 2034 



<210> 84 
<211> 2240 
<212> DNA 

<213> Homo sapiens 



<400> 84 

tcgaccacgc gtccgcttcc tcatgctgcc cgctcgccca ccgttcaggc tgctgagcct 60 

tttccttcgt ggatccgctc ccacggcagc gcgccatggc ccccgggagc cgctcctgga 120 

gaggaggcgc gctgccgcct cctccttcca gcacccatcg agtctgggac gcgagcttcc 180 

ttatgacccc gtggacacgg agggctttgg agaaggtggt gacatgcagg agcgttttct 240 

gttcccggag tacatcctgg atccggagcc gcaacccacc cgcgaaaagc agctgcagga 300 

gctccagcaa cagcaggagg aggaggagcg acagaggcag cagcggcggg aggagcggcg 360 

acagcaaaac ccacgggcca ggccccggga gcacccggtc gcggggcacc cggacccggc 420 

actgccgccc agcggcgtga actgczcggg ccgcggggca gagctgcacc gccaggacgc 480 

cggagtgccc ggctacccgc cccgagagaa gttcccccgc acggcggagg cagacggcgg 540 

gctggcacgg accgtgtgcc agcgccgccg gctgctgtcg caccaccggc gcgctctacg 600 

cctgcaggtg agccgcgagc agtacctgga gccggtgagc gccgcgctgc ggcggcccgg 660 

cccctccccg gtgctctaca tggcggacct gccggacccg cccgacgccc tgctgcccga 720 

cttgcccgcg ctggtgggcc ccaagcagcc gatcgcgccg ggaaacaaag cggacctcct 780 

gccccaggat gcccctggct accggcagag gctgcgggag cgactgcggg aggaccgtgc 840 

ccgcgccggg ctcctgccgg cccctggcca ccaagggcca cagcgccccg tcaaggacga 900 

gccacaggac ggggagaatc cgaatccgcc gaactggtcc cgcacagtgg tcagggacgt 960 

gcggctgacc agcgccaaga ccggccatgg agtggaagag ccgatccctg cccttcagcg 1020 

ctcctggcgc taccgtgggg acgtccactc agtgggcgcc accaacgccg gcaaatccac 1080 

tctctttaac acgctcctgg agtccgacta ctgcaccgcc aagggctccg acgccatcga 1140

cagagccacc atctcccctt ggccaggtac tacattaaac ctrcrgaagc ttcccatttg 1200 

caacccaacc ccttacagaa tgtttaaaag gcatcaaaga crtaaaaaag atccaactca 1260 

agctgaagaa gatcttagtg agcaagaaca aaatcagctt aacgtcctca aaaagcatgg 1320 

ttatgtcgta ggaagagttg gaaggacacc cttigtattca gaagaacaga aggataacat 1380 

tcccttcgag cctgangccg atccacttgc ctttgacarg gaaaacgacc ctgttatggg 1440 

cacacacaaa tccaccaaac aagtiagaaci: gactgcacaa gacg^gaaag atgcccactg 1500

gttccatgac acccctggaa tcacaaaaga aaattgtatt tcaaatcntc ^aacagaaaa 1560 

agaagtaaac attgctcngc caacacagcc cactgttcca agaaccCL-ig tgctcaaacc 1620 

aggaacggtt ctgctctcgg grgcratagg ccgcacagat rzcctgcagg gaaatcagcc 1680 

agcttggcrt acagtcgtgg cttccaacac cccccctgtg cacaccacct ccrcggacag 1740 

ggcagacgct ctgcatcaga agcacgcagg ccatacgtca ccccagattc caatgggtgg 1800 

aaaagaacga atggcaggac ctccccccct tgttgccgaa gacatcacgc caaaagaagg 1860 

actgggggca tctgaagcag tggccgacat caagtttccc cctgcaggtc gggtntcagt 1920 

aacacctaat ttcaaggaca gactgcacct ccgaggctiat: acacctgaag gaacagttct 1980 

gaccgtccgg ccccctctct tgccatacac cgtcaacacc aaaggacagc gcatcaagaa 2040 

aagcgtggcc tataaaacca agaagcctcc ztccctzazg cacaacgcga ggaagaagaa 2100 

aggaaagaca aatgtatgag accgacctirg tccactccag auactaaccg caccgaacac 2160 

aacaaaacac actgaatttg tatcaaacac acaacgcata aacaaagctc ccattcttac 2220 
ccttaaaaaa aaaaaaaaaa 2240



<210> 85
<211> 1488 
<212> DNA 

<213> Homo sapiens 



<400> 85 
cgccaagttc ccggagggag agggtagaaa ccggaggggg tggacccgtc acccacggga 60

ctgagggtcc ctttctcccg cccccaggag gaacgagaac gaacacgacc caageccggg 120 

ttctggtggc tgcagnggcg gggtcggcgg ccgccctgct ccacgcczcc atccacaaga 180 

ctgaggaggg ccacctggct gtgtactaca ggggaggagc tccacuaacc agccccagtg 240 

gaccaggcta tcatatcatg ccgcctttca ccactacgtt cagatctgtg cagacaacac 300 

cacaaactga tgaagctaaa aatgcgcccc gtggaacaag tggtggggtc atgatctata 360 

ttgaccgaac agaagtggtt aatatgttgg ctccctatgc agcgtttgat atcgtgagga 420 

actatactgc agattacgac aagaccttaa tcttcaacaa aatccaccat gagccgaacc 480 

agttctgcag tgcccacaca cttcaggaag tttacattga actgtttgat caaatagacg 540 

aaaacctgaa gcaagccccg cagaaagacc taaacctcat ggccccaggt. ctcactatac 600 

aggctgtgcg tgttacaaaa cccaaaatcc cagaagccac aagaagaaar. tttgagtcaa 660 

tggaggctga gaagacaaaa ctccttatag ccgcacagaa ^acaaaaggtt gcggaaaaag 720 

aagctgagac agagaggaaa aaggcagcca tagaagcaga gaagaccgca caagcggcaa 780 

aaattcggtt tcagcagaaa gtgatggaaa aagaaactga aaagcgcact tctgaaatcg 840 

aagatgctgc attcctggcc cgagagaaag cgaaagcaga tgctgaacat tatgctgcac 900 

acaaatatgc cacctcaaac aagcacaagt cgaccccgga acatccggag ctcaaaaagt 960 

accaggccat tgcttccaac agtaagatct: attttggcag caacatccct aacacgctcg 1020 

tggactcctc atgcgctttg aaatactcag acattaggac tggaagagaa agctcactcc 1080 

cctctaagga ggctcttgaa ccctctggag agaacgtcat ccaaaacaaa gagagcacag 1140 

gttgatgcaa gaggtggaaa tgccctccat atcaagacgt ggcccaaggg gttaagtggg 1200 

aacaatcatc atacggactc ctcagactta cagagaactt acacttcacc tgcnccacct 1260 

ctcctgcgat agtcccgggt gctccaccga tcggaggaca gagccagctg cccgacacac 1320 

aaatggtctt ttcagccaca gtctcaccaa gtatcccata cqtattcctt. cctaaactgc 1380 

tactcatgaa tgaggaaagt ctgacgctaa gatactgcct gcaccggaat gttaaacact 1440 

aaatatacaa caagctgtgt tctcctaagc cgaaaaaaaa aaaaaaaa 1488 



<210> 86 
<211> 3174 
<212> DNA

<213> Homo sapiens 



<400> 86 

gcggacgctg grcscaaaca ctaaggcctg agcggcgaca accgaggcga gatgacggtc 60 

aacagggaat gcctcgtggg agaaaaaaga caatcttatt ctcagcgctg attctgagac 120 

gatgggctcg ggaaacgggc gccgcagcac gaagtcgccg cccctcgtgc cggccgccct 180 

ggtggcctgc atcaccgtct tgggctccaa cractggatt gcgagctccc ggagcgtigga 240 

cctccagaca cggatcatgg agctggaagg cagggtccgc aggcgggctg cagagagagg 300 

cgccgtggag ctgaagaaga acgagttcca gggagagcrg gagaagcagc gggagcagct 360

tgacaaaatc cagtccagcc acaactrcca gctggagagc gicaacaagc -gtaccagga 420 

cgaaaaggcg gttttggcga acaacatcac cacaggcgag aggctcaccc gagtgctgca 480

agaccagtta aagaccccgc agaggaacta cggcaggcng cagcaggacg ticctccagtc 540 

tcagaagaac cagaccaacc cggagaggaa gttctcctac gacccgagcc agtgcatcaa 600 

tcagatgaag gaggtgaagg aacagcgtga ggagcgaaca gaagaggcca ccaaaaaggg 660 

gaatgaagct gtagcttcca gagacccgag tigaaaacaac gaccagagac agcagcccca 720 

agcccccagc gagccccagc ccaggctgca ggcagcaggc ctgccacaca cagaggtgcc 780 

acaagggaag ggaaacgtgc ttggtaacag caagtcccag acaccagccc ccagttccga 840 

agtggttttg gattcaaaga gacaagctga gaaagaggaa accaacgaga cccaggcggt 900 

gaatgaggag cctcagaggg acaggctgcc gcaggagcca ggccgggagc aggtggtgga 960 

agacagacct gtaggtggaa gaggcctcgg gggagccgga gaactgggcc agaccccaca 1020 

ggcgcaggct gccctgcyag cgagccagga aaaticcagag atggagggcc ccgagcgaga 1080 

ccagcttgtc atccccgacg gacaggagga ggagcaggaa gctgccgggg aagggagaaa 1140 

ccagcagaaa ctgagaggag aagatgacta caacacggat gaaaatgaag cagaacctga 1200 

gacagacaag caagcagccc tggcagggaa tgacagaaac acagacgttc tcaatgncga 1260 

agatcagaaa agagacacca taaatctacc tgatcagcgt gaaaagcgga accacacacc 1320 

ctgaattgaa ccggaatcac ataccccaca acagggccga agagatgact acaaaatgrr 1380 

cacgagggac tgaanaccga aaactgtgaa atgtactaaa taaaacgtac acccgaagat 1440 

gattattgtg aaattccagt atgcaccctg tgtaggaaaa aacggaatgg tcctttaaac 1500 

agcttttggg gggtactttg gaagtgtcta ataaggtgtc acaattitccg gtagtaggta 1560 
tctcgtgaga agctcaacac caaaactgga acatagttcc ccctcaagtg trggcgacag 1620

cggggccccc tgattctgga acacaacctc gcgtaaatca acagccaccc acagaagagt 1680 

ccatctgccg tgaaggagag acagagaacc ccgggctccg ccgtcccgcc cacgcgccgt 1740 

accaagtgct ggtgccagcc tgttacctgt: tcccactgaa aagtctggct aacgctcttg 1800 

tgtagtcact tccgattctg acaatcaacc aatcaatggc ctagagcact gactgctaac 1860 

acaaacgtca ctagcaaagt agcaacagct ccaagcctaa atacaaagct gccctgcgtg 1920 

agaacctttt aaaaggctac ctgcacaata acccctgcca cttttaatgt acaaaacgct 1980 

attaagtggc ttagaatttg aacactitgtg gtctttacct actttgcttc gcgtgcgggc 2040 

aaagcaacat cttccctaaa catatattac caagaaaagc aagaagcaga ttaggttttt 2100 

gacaaaacaa acaggccaaa agggggctga cctggagcag agcacggtga gaggcaaggc 2160 

atgagagggc aagtttgtcg tggacagacc tgtgcctacc ttatcaccgg agcaaaagaa 2220 

aacaaagttc atcgatgccg aaggatatat acagtgctag aaactaggac tgtttagaaa 2280 

aacaggaata caatggtcgt ttttatcata gtgcacacat ttagcttgcg gtaaatgact 2340 

cacaaaaccg attttaaaat caagtcaatg tgaattttga aaactaccac ttaatcctaa 2400 

ttcacaacaa caatggcatt aaggctcgac ctgagttggt tcttagcatt acttatggta 2460 

aacaggcccc taccacttgc aaataaccgg ccacaccatt. aatgaccgac tccccagtaa 2520 

ggctccctaa ggggtaagta ggaggatcca caggatctga gacgctaagg ccccagagat 2580 

cgtttgatcc aaccctctta ttttcagagg ggaaaatggg gcctagaagt tacagagcat 2640 

ctagctggtg cgctggcacc cctggcctca cacagacrcc cgagtagctg ggactacagg 2700 

cacacagtca ccgaagcagg ccctgcttgc aacccacgtt gccacctcca acctaaacat 2760

tcttcacatg tgacgtccte agccaccaag gctaaactcc cccacccaga aaaggcaact 2820 

tagataaaat cttagagcac ccncacactc CwCcaagccc ccttccagcc tcaccctgag 2880 

tcccccctgg ggttgatagg aattttciict cgcccrctca ataaagtctc cattcatccc 2940 

atgtttaatt tgtacgcaca gaatcgctga gaaacaaaac gtcctgnrca acttaaaaaa 3000 

cttgtcacag ctctgctttc ttgcttttcc cctccatcct ctactagatc cccaaagaat 3060 

ccaccctcaa accgtggact aaggatggca gtttgcccta aaccttctgt ggaatgagat 3120 

gggctgtcag gacgtcttca acccacagca ccccgaaagc cccgtgctca ccac 3174 



<210> 87 
<211> 2780
<212> DNA 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (2760) 

<223> n equals a,t,g, or c
<400> 87 

ggcgccgcgg gcggggggcc ccggggccac gtggacgccg acggctccgg cctgcggcgc 60
gtggacaccg ccagcggctg cagccgacci ggaatgaccc gggagacaaa tacaacagca 120
tggaagakgc caaagtctac gcggccaaag cggactgcac ggcccacccc gacgngcgct 180
ccgcccaggg ggtgcgagga taccccacct taaagcctct caagccaggc caagaagctg 240
tgaagtacca gggtcctcgg gacttccaga cactggaaaa ccggacgccg cagacaccga 300
acgaggagcc agcgacacca gagccggaag tggaaccgcc cagcgccccc gagctcaagc 360
aagggctgta cgagctctca gcaagcaacc ctgagccgca cgtcgcacaa ggcgaccact 420
ctatcaagtt cttcgctccg tiggtgcggcc actgcaaagc cctggctcca acctgggagc 480
agctggccct gggccctgaa cactccgaaa ccgccaagac tggcaaggtu gaccgtacac 540
agcactacga accctgcccc ggaaaccagg tccgtggcta ccccactcct ctcrggttcc 600
gagatgggaa aaaggtggac cagtacaagg gaaagcggga tctggagcca ctgagggagt 660
^cgtggagtc gcagctgcag cgcacagaga ccggagcgac ggagaccgtc acgccctcag 720
aggccccggt gctggcagct gagcccgagg ctgacaaggg caccgtgccg gcacccaccg 780
aaaacaaccc cgacgacacc actgcagaag gaataacctt caccaagttc tatgctccac 840
ggtgtgguca ctgtaagacc ctggctccta cttgggagga actctctaaa aaggaattcc 900
ctggtctggc gggggtcaag atcgccgaag cagactgcac cgctgaacgg aatacccgca 960
gcaagtattc ggtacgaggc caccccacgt tactgctctt ccgaggaggg aagaaagtca 1020
gtgagcacag tggaggcaga gacctcgacc cgtcacaccg ctttgtccng agccaagcga 1080
aagacgaact ttaggaacac agtcggaggc cacctcnccc gcccagctcc cgcaccctgc 1140
gtttaggagt tcagtcccac agaggccact gggttcccag tggcggctgt tcagaaagca 1200

gaacatacta agcgcgaggt accttcttcg tgtgtgcgtt ttccaagcca acacactcca 1260 

cagatccttt attaagttaa gtttccctaa gcaaatgcgt aactcacggt cactgtgtaa 1320 

acattttcag tggcgatata tcccctctga ccttctctcg atgaaattta catggtttcc 1380 

tctgagacca aaatagcgtt gagggaaacg aaactgctgg actatttgtg gctcccgagt 1440 

cgagtgattc tggtgaaaga aagcacatcc aaagcatagt ttacctgccc acgagttctg 1500 

gaaaggtggc cttgtggcag taccgacgtt cctctgatct caaggtcaca gttgactcaa 1560 

tactgtgtcg gtccgtagca cggagcagat tgaaacgcaa aaacccacac ctctggaaga 1620 

caccttcacg gccgctgccg gagcttctgt tgctgtgaat acttctctca gtgtgagagg 1680 

ttagccgtga tgaaagcagc gctacctctg accgcgcctg agtaagagaa tgctgatgcc 1740 

ataactttat gcgtcgatac tcgtcaaatc agttactgtt caggggatcc ttctgtttct 1800 

cacggggtga aacacgtcct tagctcctca tgttaacacg aagccagagc ccacatgaac 1860 

tgttggatgt cttccttaga aagggtaggc atggaaaatf ccacgaggct cattctcagt 1920 

atctcactaa ctcattgaaa gattccagtc gtatttgtca cctggggtga caagaccaga 1980 

caggcttccc caggcctggg tatccaggga ggctctgcag ccccgctgaa gggccctaac 2040 

tagagttcta gagtttctga ttctgtttct cagtagtcct tttagaggct tgctatactt 2100 

ggtctgcttc aaggaggtcg accttctaat gtatgaagaa tgggatgcat ttgatctcaa 2160 

gaccaaagac agatgtcagt gggctgctct ggccctggtg cgcacggctg tggcagctgt 2220 

tgatgccagt gtcctctaac tcatgctgtc cttgtgatta aacacctcta tctcccttgg 2280 

gaataagcac atacaggctt aagctctaag ataggtgttt gtccttttac catcgagcta 2340 

cttcccacaa taaccactnt gcatccaaca ctcttcaccc acctcccata cgcaagggga 2400 

tgtggatact tggcccaaag taaccggtgg taggaatctt agaaacaaga ccacttatac 2460 

cgtctgccLy aggcagaaga taacagcagc atctcgacca gcctctgcct ta^iaggaaat 2520 

ctttattaat cacgtatggt tcacagataa ttcttttttt aaaaaaaccc aacctcctag 2580 

agaagcacaa ctgtcaagag tcttgtacac acaacttcag ctttgcatca cgagtcttgt 2640 

attccaagaa aatcaaagtg gtacaatttg tttgtttaca ctatgatact ttctaaataa 2700 

actctttttt tttaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaggggg gggggggggn 2760 

ggggcccccc cccccccaaa 2780 



<210> 88 
<211> 1061 
<212> DNA 

<213> Homo sapiens



<400> 88 

aattcggcac gagagaagga aatacatcaa aa::gcccaca ttggttatcc g::agaggata 60

gaatgaaaga tggctttatc ctct;:gtttg ctcttactaa agcaatcaga tggngcttct 120 

cctgtactca gagccctggc tgctccccgc ctggccrctc ccgcgggctg ccgtggaacc 180 

agaaaagcct taaacggaaa tgtgggagag aaggttggat tcactttcat gtctttccag 240 

ggttgtgacc cctcaagtcc tggttgcctt tgctgttctc tattaccttc aaacagccag 300 

ctcgtcttta tttctttttt agttttgtcg gggttggctt gatagatgtt agcccatcat 360 

agccagatgc gtctagcctt gtcttttgaa tgcaagattc aggatgtggg tacttagctg 420 

ttagtggaca tcagagtcac tagtcaggat gaaagagttc ttggcttcaa ctcccagaaa 480 

ttctggtaac gtcatgtata gtgacggccg catgtctaac aggtggccag gtaagtcttt 540 

tggggtggtc tgtgaatcac agttcgggag acattgactt tcagggagtt tgttctgaat 600 

tcactagata atagagatat aatacagagc tttgaaagct ggtgtcttga tgacagagcc 660 

gtggcaatgg ggagggttga ggaggtggct gttgggcctg tctcctggtg agagttgaaa 720 

gggcctgaac tcaagcagag gcctcagaac cgaaaggtgg tggaaggatg cagcaagagg 780 

cgccacacag gagtactctg cgccctggca gggtctgaat acacgtggga gtggtgagag 840 

ggagaacttt aagtccaggt tttgcgcctc agcgacttag tgtggccaca tcattagaaa 900 

tgtgttgagg ccgggcacag tggctcatgt ctgtaatccc agcaccttga gaggctgagg 960 

caggaggacg gcttgaggcc aggagtttaa aaccagcctg gacaacatag tgagagcctg 1020 

tctctacaaa aaaaaaaaaa aaaaactcga gggggggccc g 1061 



<210> 89 
<211> 1342 
<212> DNA 
<213> Homo sapiens



<400> 89 

ccacgcgtcc gggcgcgcgg gtgaaaggcg cactgacgca gcctgcggcg gcctcggagc 60 

gcggcggagc agacgctgac cacgttcccc tccccggtct cctccgcctc cagctccgcg 120 

ctgcccggca gccgggagcc atgcgacccc agggccccgc cgcctccccg cagcggctcc 180 

gcggcctcct gctgctcctg ctgctgcagc tgcccgcgcc gtcgagcgcc tctgagaccc 240 

ccaaggggaa gcaaaaggcg cagccccggc agagggaggt ggtggacctg tacaatggaa 300

tgtgcttaca agggccagca ggagcgcccg gtcgagacgg gagccctggg gccaacggca 360 

ttccgggtac acctgggatc ccaggccggg acggattcaa aggagaaaag ggggaatgtc 420 

tgagggaaag ctttgaggag tcctggacac ccaactacaa gcagtgttca tggagctcat 480 

tgaattacgg catagatctt gggaaaattg cggagtgtac atttacaaag atgcgttcaa 540 

atagtgctct aagagttttg ttcagtggct: cacttcggct aaaacgcaga aatgcacgct 600 

gtgagcgttg gtatttcaca ttcaatggag ctrgaatgttc aggacctctt cccattgaag 660 

ctataattta tttggaccaa ggaagccccg aaatgaattc aacaattaat attcatcgca 720 

cttcttctgt ggaaggactt tgtgaaggaa ctggtgccgg attagtggat gttgccatct 780 

gggttggcac ttgttcagat tacccaaaag gagacgcttc tactggatgg aattcagttt 840

ctcgcaccat tatcgaagaa ctaccaaaac aaatgctcta attttcattt gctacctctt 900 

tttttattat gccctggaac ggtccaccta aatgacatct taaacaagct tatgcataca 960 

tctgaatgaa aagcaaagct aaatatgttt acagaccaaa gtgtgatttc acactgtttt 1020 

taaatctagc attattcatt ttgcttcaat caaaagtggt cccaatattt tttttagttg 1080 

gttagaacac tttctccata gtcacattct ctcaacccat aatctggaat attgttgtgg 1140 

ccctctgttt tttctcttag tacagcatrt ttaaaaaaac ataaaagcta ccaatcttcg 1200 

tacaatttgc aaacgctaag aacttttttt atatctgtta aacaaaaatt acccccaaaa 1260 

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1320 

aaaaaaaaaa aaaaaaaaaa aa 1342 



<210> 90 

<211> 770 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> (690) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 

<222> (762) 

<223> n equals a,t,g, or c 



<400> 90 

ggcagggtcg ggkcgaccyr cgcgcccggc ccttcctcag gcggcggcca tggcgggaca 60 

ggaggatccg gtgcagcggg agactcacca ggactgggcc aaccgggagt acattgagat 120 

aatcaccagc agcatcaaga aaaccgcaga cccccccaac ccgtccgaca tgtctcgccg 180 

ttcaagacct gcaacaccaa acgagaaatc gacagcccct gaacggagaa tagagcacat 240 

cgaagctcgg gcgacaaaag gcgagacacc cacctagaac agtgccgtigc tgctgccggg 300 

aagtcgcLtc acacaacaca ggccacacgg gaaaggcccc agcagcctrc agccccttcc 360 

tttcccctta aagagcaaca gggctcaccc ccgttcctct cccctcaaaa gtgtggcccc 420 

tgggctctgc catctggggt gtggtgcggt acgcgggaag aagtccagag gaaccgttgg 480 

aaacgacgtt aggcatctta ccctttcagc aacacttcat acatctaccc gccaatgtat 540 

ttgagacatt cacagccaaa agcctgggac cctttgtgaa ggtcctcctc amctctatct 600 

ttctttcrct ctctcrcaaa ctccccctaa agttctcact gcccctgcac tgcttctgtg 660 

aacagccctt gtctcctccc cacctccggn ggaaagtgcg gggcagtcct ggtcaagaca 720 

cccatgcccc ggcaacgtgg cccgcagaga acgctgtcgt anccaccagc 770 
<210> 91
<211> 1570 
<212> DNA 

<213> Homo sapiens 
<400> 91 

gatggcttta ctgaagttgc taaaagctta cagaaaagga agtgcaggaa catttcacaa 60 

atccacaatc tgtgagtacc acatccrgta tagctgtmaa cactggaata aggaagggct 120 

gacgactccc agaagatgaa ggtaagcaga aaccgttgat gggaccgaga aaccagagtt 180 

aaaacctctt tggagcttct gaggactcag ccggaaccaa cgggcacagt tggcaacacc 240 

accatgacat caca'acctgc tcccaacgag accaccatag tgctcccacc aaatgccatc 300 

aacttctccc aagcagagaa acccgaaccc accaaccagg ggcaggacag cctgaagaaa 360 

catctacacg cagaaatcaa agttattggg actatccaga tcttgtgtgg catgatggta 420 

tcgagcttgg ggatcatttt: ggcatccgct tcctcctctc caaattttac ccaagtgact 480 

tctacaccgt tgaactctgc ccacccattc ataggaccct ttttttttat catctctggc 540 

tctctaccaa tcgccacaga gaaaaggtcr accaagcctt cggtgcacag cagcctggtt 600 

ggaagcattc cgagtgctcc gtctgccccg gtgggcttca ccaccctgtic tgtcaaacag 660 

gccaccttaa accctgcctc accgcagtgt gagctggaca aaaacaatat accaacaaga 720 

agttatgttt ctcactttta tcatgatcca ctttatacca cggaccgcta tacagccaaa 780 

gccagtctgg ctggawcccr ctctctgacg ccgatctgca ccccgctgga actccgccta 840 

gctgtgccca ctgctgcgct gcggcggaaa caggcctact ccgacttccc cgggagtgca 900 

cttttcctgc cccacagtca cattggcaat tccggcatgc cctcaaaaat gacccacgac 960 

t.gtggar.a r_g aagaactatt gacttctcaa gaaaaaaggg agaaacarza a^cagaaagt 1020 

tgattcttac gataatatgg aaaagctaac cattatagaa aagcaaagcc cgagtctcct 1080 

aaatgtaagc tttcaaagta acgaacatta aaaaaaacca ctattccacc gccattcaag 1140 

atatgtgccc attggggatc tcttgactcg cc^gacaccg acttcagcaa aagcacgggg 1200 

ctgtaaatta ccatttacta gattagccaa acagtcngaa trtccagaaa acaaggcaga 1260 

atgatcattc ccagaaacat ttcccagaaa acgcttccca gaaaactaga cagmatgatc 1320 

attcaacgga tcacagtgaa gcaaaggaca caacttctca ccgtacccct taatcgtcaa 1380 

caggagtcaa ctgatttgtt gtggtgccca gactttttta cacaggtgct agcgttttat 1440 

cctatgtatt ttaactcatt agcgcataaa ggcaagcccc atataacgaa gtctcagggt 1500 

atatgaaagt agctggcttc aaaacaaaat tttcgagtgc aaaaaaaaaa aaaaataaaa 1560 

aaaaaaaaaa 1570 



<210> 92 

<211> 2950 

<212> DNA 

<213> Homo sapiens 



<400> 92 



cccggctccc gcccgctccc agccgggccc cccagcggtc ggcgggacgg crcccggctg 60
cagtctgccc gcccgccccg cgcgggggcc gagtcgcgaa gcgcgcc-gc gacccggcgc 120
ccgggcgcgc tggagaggac gcgaggagcc atgaggcgcc agctgcgaag gtggcggcgc 180
tgctgcccgg gccgcccttg gagtgcacag aagccaaaaa gcactgccgg tatttcgaag 240
gactctatcc aacctattaC atatgccgct cccacgagga ctgctgcggc cccaggcgct 300
gtgtgcgggc cctccccaca cagaggccgt ggcacccccg gccccttccg atga::gggcg 360
tgcctttctg ctgcggagcc ggcttcttca tccggaggcg catgtacccc ccgccgccga 420
tcgaggagcc agccttcaac gtgtcctaca ccaggcagcc cccaaatccc gpcccaggag 480
cccagcagcc ggggccgccc tatcacacyg acccaggagg accggggatg aaccccgccg 540
ggaattccac .ggcaacggct ttccaggrcc cacccaactc accccagggg agcgtggcct 600
gcccgccccc cccagcctac cgcaacacgc ccccgccccc gtacgaacag gcagtgaagg 660
ccaagtagtg gggtgcccac gcgcaagagg agagacagga gagggccttit ccccggccct 720
tctgtcttcg ccgatgttca cccccaggaa cggtctcgtg ggctgccaag ggcagztcct 780
ctgatatcct cacagcaagc acagctcccc ttcaggcttt ccatggagca caatatatga 840
actcacactc cgtctcct:c~ gctgcttctg tccccgacgc acctgtgccc tcacacggta 900
gcgtggtgac agLccccgag ggccgacgtc cttacggtgg cgtgaccaga tccacaggag 960
agagactgag aggaagaagg cagtgctgga ggtgcaggcg gcatgcagag gggccaggcc 1020
gagcacccca ggcaagcar.c czzctgcccg ggcactaana ggaagcccca tgccgggcgg 1080
ctcagccgac gaagcagcag ccgaccgagc ngagcccagc aggtcacctg ctccagccrg 1140
tcctctcgtc agccttcccc ctccagaagc cgctggagag acattcagga gagagcaagc 1200 
cccttgtcat gcccctgcct ctgttcatat cctaaagata gacctctcct gcaccgccag 1260 
gaaagggtag cacgtgcagc tctcaccgca gacggggcct agaatcaggc ctgcttggag 1320 
gcctgacagc gacctgacat ccactaagca aacctattta aatccatggg aaaccacctc 1380

ctgccccaaa ctgagacatt gcattttgtg agctcttggt ccgatttgga gaaaggactg 1440
ttacccactt ttttggcgtg tctatggaag cgcacgtaga gcgtcccgcc ctttgaaatc 1500 
agactgggtg tgtgccttcc ctggacatca ccgcctctcc agggcattct caggcccggg 1560 

ggtccccttc ccccaggcag ctccagtggt gggttctgaa gggcgctttc aaaacggggc 1620 

acatctggct gggaagtcac atggactctt ccagggagag agaccagctg aggcgcccct 1680 

ctctgaggtt gtgttgggcc taagcgggcg cgcgctgggc tccaaggagg aggagcttgc 1740 

tgggaaaaga caggagaagc actgactcaa ctgcactgac catgttgcca taattagaac 1800 

aaagaagaag tggtcggaaa tgcacattcc tggataggaa tcacagctca ccccaggatc 1860 

ccacaggtag cctcctgagc agctgacggc tagcggggag ctagttccgc cgcatagtta 1920 

tagtgctgat gtgtgaacgc tgacccgccc tgtgtgctaa gagctacgca gcttagctga 1980 

ggcgcctaga ttactagatg tgctgtacca cggggaatga ggtgggggcg cttatctttt 2040 

aatgaactaa tcagagcccc ttgagaaatc gtcactcatt gaactggagc atcaagacat 2100 

ctcatggaag tggatacgga gtgatttggc gtccatgctt ttcactctga ggacatttaa 2160 

tcggagaacc tcctggggaa ttttgcggga gacacttggg aacaaaacag acaccctggg 2220 

aatgcagtcg caagcacaga tgctgccacc agtgtctctg accaccctgg tgtgactgct 2280 

gaccgccagc gtggtacctc ccacgcwgca ggcccccatc caaacgagac aacaaagcac 2340 

aacgttcact gcttacaacc aagacaactg cgcgggccca aacactcctc ttcctccagg 2400 

tcatttgttt tgcattttta atgtctttiat: nt: tttgcaac gaaaaagcac acraagctgc 2460 

ccctggaatc gggcgcagcc gaataggcac ccaaaagtcc gcgaccaaat cccgt-ttgcc 2520

ttcttgatag caaattatgc taagagacag tgacggccag ggctcaacaa ttttgcattc 2580 

ccatgtttgt gtgagacaga gcctgttttc ccccgaactc ggttagaact gtgctactgt 2640 

gaacgctgat cctgcatatg gaagtcccrc ttcggcgaca cttcctggcc attcttgttt 2700

ccattgtgtg gatggtgggt tgtgcccact ccctggagtg agacagctcc cggtgcgtag 2760 

aattcccgga gcgtccgtgg ctcagagtaa acttgaagca gatctgcgca tgcttttcct 2820

ctgcaacaat tggctcgttc ctcttcttcg ttcccttttg acaggacccc gtttcctatg 2880 

tgtgcaaaac aaaaataaac ccgggcaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2940 

aaaaaaaaag 2950 



<210> 93 

<211> 1722 

<212> DNA 

<213> Homo sapiens 



<400> 93 

ggcacgagcc agagcaggct gc::aggcctg gggccaccac cgcccctggg tgctacaccc 60 

agtgtgccgg grcactggga acccccngaa g-ggcgccac ccgaaccggg cccccaagga 120 

tggggtgcgg gcagtaccgc aggaagagga gcagccccrg tgaagaccga gagctgccag 180 

aggctccgtg atcggccgcg gcacgacgac ccgcgcacgg attggctgct ccgggccggg 240 

gggccgggcc cgggggacag aatccgcccc cgaaccttca aagagggcac cccccggcag 300 

gagctggcag accyaggagg cgcgacagac ccgcggggca aacggactgg ggccaagagc 360 

cgggagcgcg ggcgcaaagg caccagggcc cgcccagggc gccgcgcagc acggccctgg 420 

gggttccgcg ggccttcggg rgcgcgcctc gcczctaccc atggggtccg cagcgttgga 480 

gatcctgggc ccggtgccgt gcccggcggg ctgggggggc ccgatccrgg cgcgcgggct 540 

gcccatgtgg caggtgaccg ccttcccgga ccacaacatc gcgacggcgc agaccacctg 600 

gaaggggccg cggacgtcgt: gcgcggtgca gagcacsggg cacatgcagt gcaaagtgta 660 

cgactcggtg ctggctctga gcaccgaggt gcaggcggcg cgggcgctca ccgcgagcgc 720 

cgtgctgctg gcgctcgtcg cgctcttcgc gaccctggcg ggcgcgcagc gcaccacctg 780 

cgtggccccg ggcccggcca argcgcgcgt ggcccrcacg ggaggcgrgc tccacctgtc 840 

ttgcgggctg ctggcgcccg cgccactictg ctggttcgcc aacattgtcg tccgcgagtt 900 

ttacgacccg tictgtgcccg tgtcgcagaa gcacgagctg ggcgcagcgc tgtacatcgg 960

cngggcggcc accgcgccgc tcatggtagg cggccgcccc ttgcgctgcg gcgcctgggt 1020

ctgcaccggc cgccccgacc tcagcttccc cgcgaagcac ccagcgccgc ggcggcccac 1080 

ggccaccggc gactacgaca agaagaacta cgtccgaggg cgctgggcac ggccgggccc 1140 
ctcctgccag ccacgcctgc gaggcgttgg ataagcctgg ggagccccgc acggaccgcg 1200

gctLccgccg ggcagcgcgg cgcgcaggct ccccggaacg cccggctccg cgccccgacg 1260 

cggctcctgg atccgctcct gcctgcgccc gcagctgacc ttctcctgcc actagcccgg 1320 

ccctgccctt aacagacgga atgaagtccc cttttctgtg cgcggcgccg cttccatagg 1380 

cagagcgggt gtcagactga ggatttcgct tcccctccaa gacgctgggg gtctcggctg 1440 

ctgccttact tcccagaggc ccctgctgac t:ccggagggg cggatgcaga gcccagggcc 1500 

cccaccggaa gatgtgtaca gccggccctt acrccatcgg cagggcccga gcccagggac 1560 

cagtgacttg gcctggacct cccggcctca czccagcatc tccccaggca aggctcgtgg 1620 

gcaccggagc" ttgagagagg gcgggagrgg gaaggctaag aatccgctta gtaaacggtc 1680 

tgaaccctca aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aa 1722 



<210> 94 

<211> 635 

<212> DNA 

<213> Homo sapiens 

<400> 94 

gcctaaagag agctccccca ggaccagccc cggccaaggg atzgccgcag ccctcatcca 60 

cctcccaagc actggaaaca aacattggag accaagcgag gcgtcactca acagccgcag 120 

taatcaggga aatgacaagt cacacaccga ca^cccrtgc tctgctgatt ggagccgggc 180 

gcactgaaaa agatcagtcg cgcccagtct ctgggggaag gaagcgtctt cacctgttgc 240 

ttgtgggagg acagttgagg caggtgagga cgctgagagg ^gagccc^gr ngngcctgtt 300 

accgtccaca tgtgcaagcc cttcagcccg gcggtcgtac ttgtticcga gacgcagttt 360 

cactcttgtc acccaggctg gagtgcatgg catgatcttg gcccgctgca acacccgcct 420 

cccgggttca agcgattctc ctgtc-ccac caaaaacaca aaaaccagct gggcgcggtg 480 

gtgcgtgcct ttaatcccag ctacccagaa ggcTigaggtg caagaactgc ttgaacctgg 540 

gaggtggagg ttgccgtggg ccgagaccac gccaccgcac tccagcctag gcaacagagc 600 

tagaccgcct caaaaaaaaa aaaatgaccc tcgag 635 



<210> 95 

<211> 3798 

<212> DNA 

<213> Homo sapiens 



<400> 95

ccacgcgtcc ggggcttcat acaggaaarc rarcgccgrg craagctcca gagaaaagct: 60 

cctgttcgcc caagctacta accaggccaa accaca::aga cgcgaaggaa gcggccagaa 120 

ggaagggagn gccccactgc tgacggggza agaggatccr ciaccgagaa gctgaccaga 180 

gagggtctca ccatgcgcac agc::ccrrcc g-accagtgt ggaggaaaag cacrgagtga 240 

agggcagaaa aagagaaaac agaaacgctc cgccctcgga gaactgccaa cc::agggcta 300 

ctgttgactt tgactatctc ctcagtggcc gaagcggagg g-gccgccca accaaacaac 360 

tcatcaacgc cgcaaactag caaggagaac catgctccag ct-caagcag rttatgtacg 420 

gatgaaaaac agatcacaca gaaccacccg aaagcactcg cagaagrcaa caccccatgg 480 

cctgtaaaga cggctacaaa tgccgtgci:: zgzzgccctc cza^zcgcazz aagaaacttg 540 

atcacaacaa catgggaaat aaccctgaga ggccagcctt ccigcacaaa agcccacaag 600 

aaagaaacaa acgagaccaa ggaaaccaac cgraccgacg acagaataac ctgggwCCcc 660 

agacccgacc agaattcgga cczzcagazz cgzaccgzgc ccatcaczca cgacgggrat 720 

tacagacgca caacggtaac acctgacggg aat"tcca-G g::gcacatca ccticcaagtg 780 

tnagctacac ccgaagtgac cctgtZLcaa aacaggaata gaactgcagt acgcaaggca 840 

gctgcaggga agccagctgc gcacatctcc rgga::cccag aggccgattrg cgccactaag 900 

caagaatact ggagcaatgg cacagcgacr gtcaagagca ca!:gccaci:g ggagqcccac 960 

aatgtgtcca ccgtgaaccg ccacgzczcc car-"gactg ccaacaagag tctgcacaca 1020 

gagctactcc ccgctccagg cgccaaaaaa tcag.caaaat: cacatattcc atacatcatc 1080 

cttactacca ttatctcgac catcgcggga cccacicggt tcrcgaaagt caacggctgc 1140 

agaaaacaca aactgaacaa aacagaatct acrccagtcg ccgaggagga tgaaacgcag 1200 

ccctatgcca gccacacaga gaagaacaa" cccctiCLatg acaccacaaa caaggcgaag 1260 

gcatctgagg cattacaaag tgaagccgac acagaccccc atactccata agtcgcrgga 1320 
ccctagtacc 

cagttctaga 

taatggattc 

ctagttctcc 

aaatcgtgca 

agcaagtgtg 

tgtatatgcc 

ccttgtcaag 

taataatata 

ttaatggtta 

aatgatggat 

ataaacaata 

gtaggataac 

tatatatgtt 

gcctttccct 

tcccttttta 

aanaaaaatg 

cattagcttc 

gttttgcaga 

tacctactga 

acacggggta 

atgaccctcc 

gttaatatca 

aaacatgaga 

cacaccatct 

agacatttca 

aagctaatct 

aaagcagtgt 

tcattacacc 

atagtcaggt 

tgcccacctc 

cttcaagata 

ctactatggg 

ccaaatccac 

aaggtcatcc 

tacattittac 

cgnatcttct 

ctttcaacac 

tgcttcactt 

cacagtttgt 

tttacatctg 

aaaaaaaaaa 



aagaaacaac 

acgaagactt 

tcattcatac 

gaagaactga 

ggacagtaaa 

aacagtttct 

catggcatat 

gcaccagtgg 

agtcccagag 

tacaaaattc 

atctagattc 

ccacatagaa 

tttctcgagt 

aaac-tgtgtc 

aaacttctat 

accaaccaga 

gatutatatt 

ttccgaactc 

Cgcgctacat 

acaantaacg 

tttacttaca 

cctccatatt 

tgaaattatt 

agttrtttca 

gagactttca 

gttaccatgg 

aatgacactt 

ccactaggtt 

gcccttgaaa 

tcaaigactac 

tcaacntgac 

atttacagaa 

gccggaacat 

gcnagacccg 

aggaagcaga 

aaacaacacc 

cccrtgcttc 

cccuaacaat 

acaaacetcc 

taaatrtrgc 

tgaaaccacc 

aaaaaaaa 



aacaaacgag 

aratrgaaac 

ccctgcacaa 

tgttattaca 

caatgaaaac 

tgcgtatcct 

gcgccaattc 

gaacaataca 

atcatttcac 

cattgtatag 

ttttaagttt 

tgcttccngt 

ggaattgcca 

aaacatatat 

tctaataact 

ttaaacatga 

tctggtcatg 

arratatctc 

cggaautgcg 

zcnazzgacc 

aactatgagc 

cctgcggagc 

ttcaaccaaa 

tca::gacaga 

agacagctac 

attttgtcct 

ggcacatgat 

ccrgcacaca 

tctatcaatc 

ctgrititaaat 

aatgccaaaa 

gctaaatitac 

ggtcgaacat 

ccaaactaga 

ggccgagacg 

tracacgtct 

gczagacctg 

c tgaaag::aa 

gaaaacggct 

acacact caa 

aacataacca 



acacattata 

taggtttccc 

ttggaatttc 

aagaaaatac 

caaatttcct 

tccagaatat 

atgtgtcccc 

ccgcattact 

atcatgacaa 

ttatatcatt 

tgtttatttc 

gcatatatct 

ggtcaaaggg 

gccaaattcc 

gtactatccc 

cgcgagacta 

Ctcgtaagag 

cacagaggcg 

tatctttazg 

acataagtgg 

tacacaacaa 

aaaacattgg 

aacataagcc 

aattgaaaca 

ttcgtcttaL 

caaagggacc 

attccgaLca 

aatatcacca 

ctccatccaa 

attactgaaa 

caggcatgtit 

actgnacaga 

liagaacgaca 

gaacctagag 

gagc'aggtg 

camac-gr 

ccaac t:t: tec 

tccatcacaa 

ccactigagar 

aa wgcgg'acc 

aga ^aaLaaa 



attactgtct 

aaggttctta 

tgattctcag 

atgcccacga 

caagaaacaa 

tttaatgtac 

ttacatatac 

gttctataca 

gtagagctac 

atttaattaa 

tttLaagttt 

ctttgttttt 

tttgtgcatt 

ctccaacatt 

tgcttctaca 

taacaagaat 

agtgaatgca 

ctgatacrtg 

gtgtacattc 

cgaaaaacgc 

gcaagtggcc 

caatgctcat 

tatttgctcc 

aaacacac tc 

aagcatattt 

tttagtctat 

agccattttg 

agcaatgtcc 

ccccggttga 

aacaaagcaa 

aaatgacatc 

aaataattgt 

taaaaaat ta 

atcagacctg 

cgaCwacrica 

CCZtLCtCCC 

ctctccttct 
ccaaatacct 
ataat cgata 

ttggcaaatg 
cccgtccatc 



gatcctccta 

gaagacactt 

ccgccaccag 

ccaaatatcc 

ccgaagaagg 

atatgacatg 

acgcacatat 

tatgaaaacc 

ctcattcttt 

aaacaaccct 

tgtttgtggt 

gagtatatct 

ttactatcga 

gttcaaatgt 

gctgccactt 

tatactattt 

cgtgngagaa 

acgcctaaca 

cactgtgata 

accacagagg 

acgggacggc 

gcaaatcatc 

acagcagaaa 

atccctcaat 

ttctccctct 

tccggatgta 

actcgaccaa 

acaatagaca 

gctgaggctc 

gacagtacta 

atagaaaaga 

acgaaaatct 

catacattct 

gcgcgrcagc 

catagtcgat 

catcccattt 

gLCCctctct 

actggggtta 

tawtcaagtg 

ctgacacagt 

accccccaaa 



1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 

2940 
3000 

3060 

3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3798 



<210> 96 

<211> 2683 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> SITE 

<222> (2640) 

<223> n equals a,t,g, or c 



<220> 

<221> SITE 

<222> (2676) 

<223> n equals a,t,g, or c 
<210> 97 
<211> 2181 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (5) 

<223> n equals a,t,g, or c 
<400> 97 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 



<400> 96 

acaasgcmac gcctgacagg tmaccggatc cgggaattcc cgggtcgacc cacggcgtcc 
gcatttgcaa taacagaaaa ggaactgcat gtatgaagtt ttcaatcgtg ggcttttctt 
a!^S^^^? ^f^^ggtcgg gggatagttt gatttccatt .Cctgaaaac gacagacttg 
gattctgttt gtgtgtgcat atttcatcca gcctcaagtt ataaagctca tctgccccg? 

tgtatgt^tt ITctll"": ^^^--^^^c tcgcgggtgt gcgtgttcac tgcgtgcg?c 
tgtatgtatt tttctgtcat cactgttccc cctcctcccg agtgtgcatt cagtcaatat 
TtllT.lr ^'^^"^"^ caaagcgcct tgaaggtctt gaaJtLtgc gtgagSa^ct 
ttatcaacta tcccaattgc atgttctcca tcacatatcc tcttattngc tctgtacccc 
TalT",lll f^l'^^-^- tatcggaaca aagctgtctg ggtaaggagt agJc^Sc 
tata^^S!! *^^^tacactt tagtctagtt ctrtattcta aatctggatr gccagta?tg 600 
tg Jaga?ct IZTaT'" ^^^-^^^ ttttttcggc cacagagtaa Lag^tttca Ho 
cgtaagatct tcataccaaa gtaggaagta aaaacagctt agaaagctct atcaaatott 7oa 

llltiZlll T^lTolT acctaaJaaa ga^a.La'c" ^T.llT.l 

taacaaac^a r^™^^ Saaagtacaa tgcaccacat ttttgtagaa gtccactatt 840 
ugacaaacag ttgaaattca agatgcgtct gacccttagt catttttact ctttaattct Inn 
""'^'^"-^ cgtacctgcc ttgtctatJr ttttctrcac cll.llTcll 960 
gtatgacata ggaaagtcat ttttttttag aattcatgga tcagtctgat ctactcctat iltc 
tcataatgga acatgtaaat acaccgaaaa ctgcttt^L ggajagaa« atgagctjga 
ccaao^r' "^^^^^^-^"^ ctaatgttcc aaaatcccca tcagagaagg ta?gatg??c 
ITcltlttll ttagtcgatt gagaatgcag gtttaacaga agagataagg 

ggcataatga ctgctggttt tccagactgg acttrcctac cgcaactact aacgctctca 
ggaccacctt tgtgtataca ctcgtagttt taaaccttac .rrLrtfr-t 
atllltaltt ^"'^^^'^^^^ ggtagaatcc aagatggctg cacrccagct gtacgataai 
ccac?a!ftJ ^.r^^f gtgtggcatc taaaaataat gtctttacag catctctctg 
• ^^^"f !" gttgacttga atttcgggaa aaaaaaaagt cggcgttgat atgcatatgt 
gtgtgcgtat atatgtattt ataaacaagt gtgtttgagc aacaagtjag tcLatagJc 
ttcccctacg catgtgtatt ccacacacaa acggccgagt Catagtcata aaacaatJtg 
tqaatctac^ tttt^T^ cagattgtca gttaaccagg aaacagttaa tgctttttaa 1680 
I?aattfa^^ ^"^'T^'' gcaaatgtcg tactaattta ggccaatttc taatactacc 1740 
tgaJtttta? attac^^" '^^"^^^gta gaaartacta aaattgtggg gagttttttc ISOO 
taa^^t^n^^ ^"g="tag gaaacatctt tactaattca gctgtcttag gtaaaacgaa 
ctt^a^a^^^ cccgtttttt tatgtgtcac tgttagtggt ctcagaactc CgatcagLa 
tlcllllltl IT.TallT" ^^"^^^^^'^ ttgaacgatc cagccgaaaa cgtatcJctc 
tacccccttc agttgtagaa aaggccaa=c cccctcagtg tcccacat^a taccaacct-a 
Sec™ ^^^""-gg gagaaataaa catacggjgj cttcagt^g? cltgg^catg 
a.^trt agaaaccaac cattcagttg Cctcaatctc aattcgttcc accccgtqaq 
gagtctgtct ccatcagttg ttgactcccc aaaacgccgc accaaataa^ agtcq^caJc 
9 cc Set T^l'^tlT' cccccatg?. ctcta^c1a\^ ^acL^g^a^g^ 

?t?ttf^f^» ^-g;'=tgcaa ctactaagcc Caaagaaaag cacacagtca cctcatcccc 
caaac^acaa t^c'^'^"'^ actccccaac cca.cccaac at.gacatgc catcngtgga 
taatJa^^^f ttctcagaat ctagrcaagc Cgccatcacc cccctcgcct tggccgttca 

s«c:::; :--f:^- —-^ -"'c^,'.„ ^ 



1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 



1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 

aaagtiggct 2580 
aoatafft^r cccgggccga cccacgcrcc cggcaggcag aggggcagcg ggcgcgggga 
aatt^ttaaJ ITrtTt"" "^^^^^^t cactaaccaa aacgaaaca? ??gccaa?It 
c??ta1"ct lalttr^ tacaacaagc caggattgcc cagaatggaa tt?tgggaga 
tgSctat^tt T.r T^^ ccaatagaga acagagcatt ggggacatcc aggt???aL 
a??citocJt "gcccccaa agaccttcct cctttaccca agLtgtggt 

cctTtcTcl Ttttttltl^ cttccatggt gggaaccaaa ctccggcaga cLaggaJgc 
cccctccaca actctccatg acctccgacc ccaggaccgt ttcagtatca ccaaa-tr^«. 
caaccggatc aaagtatgga aggaccactt gataccagtc actc^agaca gca^Jaaoa! 

E 51iF F~ ^ ii 

caaSatcctc a^cLc^ccc ^f^"^^"^^ gaagcccacg gtcggggaga cgcaca??c? 
aagaccccc aacaacaccc gagaggccgc ccgaggccaa gtctgcacct ccarr-^hhr,^ 

c^Saa^Jc ^r^''""" tcaggctgct ggagaLctg ?cgc?ggaga acJg^Jgc?? 

i HE =i H= ™ ^ 

is^ss^ - ~rci 

=° - - 

i Si F'— "~ = 
c„c, -™ s^:?,'^=:c* 

gcagccagga ccttcgccca agaagccata ccagccaaga at JaaaL? ?t!aaa?Jtc 
clSlllll ttgttgtgga cttccccctg agcagactca ccgjgjgjtt 

tatatr. J ^^^^^^^"=^3 gggacatccc caggctggtc tccgatcaca gggactctgg 
tgtcacagtg aacggagagt taattggggc acccgcccct ccaaacggcc acaaaaaaci 
gcgcacccac tcgcgcacta tcaccatcct catcaacaag ccagagfgac cttatcccoa 
IZlTtlTtl tcttggatgg tggggacaga ctggtgc'cc cctTclllll 

gagtgtggtg gtggggagct ggggkctgga ggtgcccgtg tccgccaaca ccaa^«^^»^ 

^sSc :n'.iiir,t 

rfrrr^r..^^ cgacaccacc tgggtttcta cattgccaac agcgagggcc tctccaoca« i qoa 

iT^itiiiTc :iiiiitiit 'tiiT.iitt ii'ttt'r 't'"'"» "lo 



<210> 98 

<211> 1957 

<212> DNA 

<213> Homo sapiens 

<400> 98 



taacair^nn ^^^^^^"^^ ctcatggcgc cggcgtcgcg gtcgctcgcg c^ccggacac 

5 !ES ~ ~ ' s:i 
S i 5 ~ 

aaa^^^.r^^ gctgagcacc gccaacaccc actcccacca caaagtggac ttaccc-trr 

Eii iL— 

? fSS^- 

•cc...... ss: - 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 



2040 
2100 
2160 
2181 
cacaccagca 
gccccagccc 
cagcaaaagc 
gcggtcgggg 
tcccctggac 
ctcagagatg 
ggtcaggggg 
cagagggcag 
tcacccgcag 
acggctcccg 
tctgtggcac 
gccatcggca 
tcgtccatgt 
accccacctc 
acgtgcacac 
gcaggcctgc 
ctcttccagc 
tgaccacacc 
aaagccttcg 



cgtcccacct 
agcccaccct 
gatgccgaga 
tggccaggaa 
caggacgcca 
aatgcgtcaa 
ctccgggtca 
ccatggggcc 
gtgctcctcg 
catcagctca 
aggggcacac 
gcaagcgaca 
ccgccatgta 
cctcctccac 
acacagaccc 
tgggcccgga 
tccctggggg 
ccaatctcag 
ataaacaaaa 



cgcgctcacg 
cacctgcttt 
ggggaaacag 
cataaactat 
ggcagggcag 
taacctcctc 
cggggtcaaa 
caggaccatg 
atgcccttgc 
aagctgatct 
ggtcagaggc 
cacactcacc 
cttgtcctgt 
ggggcacaga 
acatgtgggc 
cgtggctgtc 
ttgaccagga 
agccaacatc 
aaaaaaaaaa 



gatttcacta 
tccagcccac 
tccagagtcc 
gtacaggggc 
ggaacctcag 
catagccaag 
atgacccaca 
ccactggccc 
ggtcgcaggt 
cgccacacag 
tgaaaagggg 
ttcctcttct 
gaagagttga 
cccaacacaa 
ggggggcacc 

gtcctcacca 
gccggtcaga 
cacactcccc 
aaaaaaa 



cacagatagt 
aaagggggac 
aacagcagaa 
cgggggcttc 
tagtcctcca 
ttggggacga 
cgctgcagcg 
tgctccccca 
gatgccactg 
gtagccgggg 
cactgcacga 
catccacctg 
gtgccgtgct 
ggcggggatg 
ctcacgtgct 
cccccgtggt 
gatggacctg 
cacatctcct 



ggcggcaatg 
gatcacggcc 
cttgggggaa 
Cgcccagggc 
cccagccatt 
gctgttcctg 
acaagaaggg 
gccgcaggcc 
ggcgtgatgc 
atgtcccgct 
gcacctgcca 
agaaaaaagc 
tgggggagac 
ctcccacgcc 
cggcctcaat 
tccgctggca 
gccagatgtc 
gcctgccagt 



900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1957 



<210> 99 

<211> 1112 

<212> DNA 

<213> Homo sapiens 



<400> 99 
ccacgcgtcc 

ccatcgaagc 
ccagcgctgc 
tatggtgctg 
ccccaggcac 
ggtggagcag 
ccttatgcca 
agattccaga 
gaggcacgag 
cccaaggaga 
agccagtcgg 
gcagcggaaa 
tacagccccg 
ggctagagga 
tctcacaagc 
aggacccctc 
tgaaggttaa 
tcctcctttt 
ataaacccat 



gcccaccaac 

atiattggaaa 
ttcatcttcc 
gaaaatcagt 
ctcggccacg 
gtgatgccga 
aaacgtatga 
ctggcaacaa 
agcatcccac 
caggtggaac 
ccaagcctgc 
aggcccagcc 
caagaggcgc 
acaccagcac 
cgccaaagaa 
caccctcaaa 
ccatttacac 
aaggaaaaaa 
agtaaccaaa 



aaaggcaccc 
aggaaccggg 
agtgctgata 
tcatgtgctg 
ggacagaaga 
tctccattac 
gggcagaaga 
gagcccLgaa 
ggtggcaccc 
cctgggggaa 
ctctggccaa 
caagtccgaa 
ggctggacca 
aaacgacagc 
cacatcccaa 
cgcgcatgtc 
cattctaaaa 
gtaacagatt 
aaaaaaaaaa 



gcagttacat 
gaattcatta 
attatggcat 

agaca tacag 
cggcaggagg 
aggggccaaa 
gagatttLga 
gtgczccggg 
agtcataaac 
ggcacaccca 
gacacaccag 
gccgcaggca 
eg wggacagg 
ctcaagcctc 
gcagcaanca 
cagagaag::a 
atarttcaag 
gtc wCCttac 
aa 



tcaccacatt: 
aagcactcac 
tagccatcct 

gcggtcctga 
aaactgatna 
tgggccccac 
gagagagaga 
cacccgatg' 
cacccgtttr 
aagaaagcag 
ggaacacaga 
gcccagacca 
atccggtcag 
c wtcgagct - 
cagargatga 
ttaacgcctt 
Cugtaccaca 
cgtgctcgaa 



tgtaacggag 
gaaggaaatt 
gagtttctgc 

gagagaacct 
Lagacctgat 
cgagcaaggc 
tgttgacttg 
accagacgca 
ggacacaaag 
cactgaaagc 
aggttcaccc 
aggcagcaca 
cagcccctgc 
cgtaaagtaa 
agcaccttac 
aatagactac 
gtgctataaa 
aggaaaaaaa 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1112 



<210> 100 
<211> 887 
<212> DNA 
<213> Homo sapiens 

<220> 

<221> SITE 
<222> (303) 

<223> n equals a,t,g, or c 



<400> 100 
ggcacgagag 
ggtctgggta 
tatggactgt 
catgcacatg 
tctcactcta 
tcntttagaa 
ctctttttgt 
tttcatgtaa 
tatcctttgt 
tctctgttag 
cattaaacat 
caccctttgg 
ttaaaacttc 
aggccgaggt 
aaccctgtcc 



aattataggt 
gtggccactt 
gcccctctct 
tcaaaacttc 
aagtcctatt 
acgkgctgta 
aaccacaaaa 
cagtatgtct 
aatgactcca 
tgaacattca 
atcttttgac 
ctagattctt 
ctcaggaggc 
gggcggatca 
ctactaaaaa 



gcatgatggg 
gtgtgtattc 
acatgtgtgc 
aaactatatt 
tcctctcttc 
aaagaacaat 
atataaactc 
tagacacctt 
taaattattc 
ggatttttcc 
acatgtcctt 
tagcacataa 

tgggtgtggt 

cgaggtcagg 
tacaaaaaaa 



gctctcggag 

actttatgcc 

aatgcttcaa 

aaagagcgtg 

agtctaccat 

gatgtgtatg 

accacatatg 

ttcccatcat 

cactgtgaaa 

agtttttaaa 

gcacacatgt 

cgtaaatatc 

ggctcacgcc 

agatcgagac 

aaaaaaaaaa 



actggcaatg 

aacccattga 

caaaagtgtc 

caattaaaag 

tactactagt 

tctctacagg 

ttactctacc 

cgcgtggaag 

acacaccatg 

tattacagtg 

ataggtatga 

ccatagagtc 

cgtaatccca 

caccctggcc 

actcgta 



ttctgtcttg 
actgcacaga 
aatagtgatg 
gaagtcatcc 
ttctagtata 
tatatattta 
acatgctttt 
ctatcaaccc 
tttaaccagt 
acatattaaa 
tggactttaa 
aaaaccaccc 
gcactttagg 
aacatggtga 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
887 



<210> 101 
<211> 1248 
<212> DNA 

<213> Homo sapiens 



<400> 101 

ccacgcgtcc 

ggcagtgtct 

gatatcaatg 

aaggatgacc 

ttctccacct 

aacatgctgc 

tcttcagacg 

aggcccctgg 

acccccagct 

catgcaccct 

gctgaggctc 

acacacccag 

ggccactctg 

cctgccccat 

ccgccaggcc 

agtgctccaa 

acccttacta 

cccacccaca 

tgctcttctg 

ttatattcct 

aaaaaaaaaa 



ggtctatcaa 

cctgtgagac 

accttggtac 

agcccccagc 

atagccagag 

gacggcagga 

actcatccag 

tgcagcagcc 

ctgggcccct 

acagtcggac 

cagtggaggc 

ctcccaccca 

ccacaagctc 

ccgcacacct 

ccactcacac 

gcccccccac 

c tacaggccc 

ctatgccaag 

tatgtgacaa 

ctaaaactct 

aaaaaaaaaa 



ggtgacagaa 

caaggacctg 

catcaagccc 

tgcttcttct 

cccaccggac 

ggagctggag 

cccacagccc 

cgagcccctt: 

ggatgaggag 

tctgagccac 

cgtcggccca 

cggaaagcac 

caccctcggt 

agactcagtt 

caccacaggc 

tcacaccacc 

taccctcaai 

ccctzczzcc 

cattcttgca 

caaaaaaaaa 

aaaaaaaaaa 



ctgaagggcc 

tt tgccgccc 

agcctggaag 

gtcaacaagg 

acaccctcac 

aacgggacag 

tcaggcactg 

cccatccaag 

ggggccgtgg 

atcagtgagg 

aaaagcctar 

cccagccccg 

acaacaggcc 

cacaagtcca 

tccacctata 

acaggcrcca 

a tcataggcc 

cacagcaara 

catcatgtita 

aaaaaaaaaa 

aaaaaaaaaa 



tggccaacca 

tgccccaggc 

tcacatggag 

cccccacagt 

-tcgggaaca 

cacggtccct 

cccgccactc 

t tgccctccg 

ccccagtcct 

ctagtgcaaa 

cctggggacc 

ttcctcccgc 

czgzccccac 

cagacticLgg 

gcgccat::ac 

cccacaagcc 

cagtccagac 

gcccccaata 

ctggaatcct: 

aaaaaaaaaa 

aaaaaaaa 



tg-ggtcgcg 

tgcggctgtg 
ccccttcgac 
caccaagcgc 

ggctttctat 
gccatctgaa 
accagcccct 
caggcctgag 
ggcaaatggg 
tgctgccttg 
tagcccaccc 
cctggaccct 
atccacagac 
cccttcagaa 
cactacccac 
cacaacccct 
caccacaagc 
cgtagatttt 
ccttcataca 
aaaaaaaaaa 



60 
120 
180 
240 

300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1248 



<210> 102 
<211> 1841 
<212> DNA 

<213> Homo sapiens 



<400> 102 

ggcacgagct 
agggcaacga 
ctgatcatct 
ccagtatttc 
caaaaggtac 
taattgtgaa 
gtggactata 



ggtgcaggag 
gcacaacagg 
cttgcaaatc 
aagagtcact 
cttaatttga 
gtaatgtaag 
aaattctccg 



ccggagcagt 
agctiacgagg 
tgccagcgca 
tctccacttg 
agaagtgucc 
catgtggaag 
catccagcat 



accagtcgtt 
agtnggtcct 
tcggtcctat 
gtgcaggaag 
tgccttgcat 
aaaggtgaria 
agaacatcga 



gccgaagaga 
gt-ccaataag 
gctggataaa 
gcagtctttg 
taatagccta 
atgtaactac 
actucattac 



tcggactggg 
cacgnggctc 
gaaattccac 
ctacgtacag 
ataattatta 
aaatcacgcn 
ataagtgtt::: 



60 
120 
180 
240 
300 
360 
420 
aatgggtcct 
aatttattct 
tagctcactc 
agaggtattt 
tccaaatttg 
tttactaatt 
ttttcctgtt 
gtgtacatgt 
acatcaagtt 
ttagctgcac 
tgccagactg 
cactaatcag 
ttctatttta 
cttgtgtcaa 
ttagaagagt 
tggtacccca 
ttcatgtaaa 
tatgatgcct 
tgctccaaac 
gtgcatccgc 
cctacgcatg 



tccgcattaa 
gagatgttgc 
acaacttcat 
catttgatat 
gtagcagctt 
taatcttttt 
agcttggtat 
ggccaaactg 
catcccatga 
caccagtata 
gtttgttagc 
gacactgttt 
gccggttata 
gctccatgtc 
ctccagcaag 
ggaggatgcc 
atgcaatacg 
tccctttcct 
ccargttctg 
cggtaggtca 
zgczcttcag 



aaaagttgac 
ttgaggttut 
tgaatgcaac 
ttagcttctg 
gaaaacaagc 
gccaagttat 
atatgactta 
ctaatgggca 
ttatttttta 
aagcttaact 
tctagttgtt 
atatrcttcc 
cggtagttaa 
gttg.ctttgt 
ctgccctagg 
ttgtaactgc 
tctgctagtt 
ctttgagt::a 
ctgccatctc 
aggagggacg 
tagaggatitt: 



attttaaaac tgaacaagtc atgaactttc 
agcttccatt tttgaaaagt tgaaaaaaaa 



aaagatatag 
accaatgtga 
cgtacgtgtc 
aaagcccttt 
accacaaatg 
gtgccagcaa 
gactaagtat 
aagattgttg 
tcacgagcta 
gagaaagaac 
rcttagtgct 
tc'tcccgtgt 
tcggtagcct 
gatgagcagt 
gtaggggcac 
tcctitggtca 
cggggcaccc 
tgaagaaact 
ctaragagga 
gactttgaac 
nctgrgatcc 
arraaagtca 
acacaaatca 
aaaaaaaaaa 



acgttaagct 
gttcttaact 
taataaactc 
gttataaagg 
taaaattgca 
aatgttcgaa 
atgtatgtga 
gttcttccga 
ttgctgatgt 
acccccccct 
ctagccatgt 
ccctattcac 
gttttcctct 
gacttagaac 
actctctgtt 
tgatagtggt 
gaggctcagt 
aatttctggc 
gggcagactt 
tagtgcLcag 
cacaatgaag 
tggcaatatt 
aagaactttc 
a 



aaaaagcttg 
ttccattttg 
ttcaaacgct 
gaagtgactt 
tggctctttt 
atngctattc 
tataatatat 
tgctgagtat 
gacgctactc 
cttagtcccg 
caccctgtgg 
tttgtaatta 
gttttcttta 
tggccagacc 
tgttttgtac 
aatgaatcca 
agaggtttag 
acagtgtgac 
ttttttctta 
agtcacagat 
ggaaagctat 
ggccaaatta 
tccctataaa 



480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1841 



<210> 103 
<211> 685 

<212> DNA 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (678) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (679) 

<223> n equals a,t,g, or c 



<400> 103 

actacggctg 

tggtgttcgt 

taactacatt 

taaacaagtg 

tgtcatcgca 

gacacactat 

gagggcagct 

gcttcctcat 

gccgrggrtg 

tcaccctnag 

ttgcigcatt 

aaaaaaaaaa 



cgagaagacg 
cttctccccc 
gtctgattca 
ggtaat ccca 
ccggctcatc 
ggtgccgagt 
gaagtcacac 
giatcttuac 
aagrcagccr 
aaccgtgacc 
cccaccacat 
aaaaaaanna 



acagaagggg 
ctcgattgct 
gaatg-gacc 
gaatcgatrg 
tzccztczca 
ggraacargc 
atgaaagaag 
agcacgaccc 
acaccacagz 
atagcagcat: 
aaagtattzia 
aaaaa 



ggcggcgacg 
gcgcgctcat 
acattaacgc 
gccacaccac 
acccacccg^ 
gagtgtc'ga 
ccatgaccaa 
cagctttgac 
gcacagttga 
atiattccccc 
aaaaacacga 



gaggaggagg 
czzcctctcg 
cagatcatgt 
rgrcaccgta 
zgccacctgg 
rccaacagaa 
gcctggttcc 
aaacgactga 
ggagccagag 
cctggaacaa 
aaaaaaaaaa 



acggaggcgg 
gtctactcca 
tgctcaaaat 
ttactgctca 
aatatatatc 
atacacaatc 
cacttgcccc 
agctggagaa 
acctcttaaa 
aaaactatct 
aaaaaaaaaa 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
685 



<210> 104 
<211> 1168 
<212> DNA 
<213> Homo sapiens 



<400> 104 

ggcacgagcc cccgcccccc ccctcccccc cctcccctcc ccggccccgg ctccggcccc 60 

ggcccattcg ctgtcgggtc ttctgctagg gaggatgtcg ggttcgtcgc tgcccagcgc 120 

cctggccctc ccgctgttgc tggtctctgg ctccctcctc ccagggccag gcgccgctca 180 

gaacgtgaga gcacaatctg gacaggatca gaagtagaga atgaagttgt aagagaaagg 240 

gaaagacaga agaaaggctg cagtagtaca aggagaaaag caggacgcaa gaaugaggaa 300 

tgaatctctg cttgaggagc atccggaaaa anataagctg tcagaaagag taaatagacc 360 

agggacctct aaagtaaatt cacacatcaa agctaaaata atgttggaga atcacccccc 420 

gtgaaaaatt ggcttagctt tcagtacgcc tctttaaaca aaacattatc attctacaca 480 

aacttttaaa atgttgttca taatatggag tttaattcta agcaccttag ataatgcatt 540 

cgttagcttg tttgtgaaat tcttatggac ttcttttttc aaaattccta attttagttg 600 

gtaaggatta acctcgggaa gacaggaaac ccctccagta aaactaattg gttataaatg 660 

gtcacatatt tcaggcttat atacataaca aaacctccaa ctgaatttaa aagtgatctt 720 

gtgtaaaaac attattctag cgatactgat gtctaaattt aaaagagtga acacacaagt 780 

aaaatatatc tgctttttaa ggatctatcc agtttaggaa ggagagtcaa aacttgtatt 840 

taatctcaat ntattatatc gtctgaaact gaaagcaggc ctaacctttg ttcgctcctg 900 

tgtatgtaca aaggcaaaca tttaccaatc aacccttatt aatctgagac cattctgacc 960 

tgactgctca gaacttttgt ccatctgtat agaaatgggg ctttccaaat atctaaagaa 1020 

tttcctacgt atgtaaattt acttgcacac agtaagttag gggagacatt taatatccct 1080 

cagatacctg ccgttttctt gttgcgtccc ccgctttcca aataaacaaa ccgagtgcta 1140 

tctgctcatt aaaatagaat gtagtcag 1168 



<210> 105 
<211> 1175 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (24) 

<223> n equals a,t,g, or c 



<400> 105 

gcacggaaat acatcctaca catnaagcgc cgtcctgagc cgtacacwta gccacattta 60 

grattttatt argcaggatt caaggcctcc taaaaacgga tatccagata tttaaaaaca 120 

cactactaag gaatigcaatg taaaaccaca ccaacanntg catagcactt tatagttnac 180 

aaaatgccct cacacatacg agctcatcca zzzczccctc zcztctttcz acccatctac 240 

caaggaatat cctctacaag ccaggca-ca ggcttgacac tgaagacaca aaatgaaaaa 300 

gacacacaag tctctactcc caaagagctc gcagcttaat gaagcgaccc aaaaagtaaa 360 

cagttgcaat ataacatgca aaggagcgcg acgaaataca cccagagtac tatgaaaatg 420 

gcatiagaggg accaagccrg taatgaagtc agagaaagca aaagaggtga agcckgagct 480 

aagtttcaaa ggatgaacaa aaattgacca gacaaaggag ttggagaagc gggacggtac 540 

aggacaaaag cctttctaat trggggaagt gcgagtcgcc gataccagac gatatcatca 600 

atagattgcg cacagagtgc acatggtaga gagg::gagag atgaggtagc actgccaggg 660 

cagatagg::g aattctcaca gtaaccc~gt gv/cgcacmt~ atcatcctcc cccccactat 720 

gcaggggaga acaccaaagc ccaaagagat caagtaactt tccagaattt cgatttcaaa 780 

tcctaagcgt tcggtagctg tgtcrccctg catacagtct gcctcaaaag aaaatttaaa 840 

gtcgeatctt gcatcaaacc taaacaaara raggccgggt gcag^ggctc acgcctgtag 900 

tcccagcact ccggtaggcc aaggtgggcg gatcacctga ggtcaggatc ccgagaccag 960 

cctggccaat atggtgaaac cccatctcca ctaacacaca aaaatcagct gggcatggtg 1020 

gcacacgcct gtaatcccgg cuactcggga agctgaggca ggagaatctc ttgaacccag 1080 

gaggcggagg ctgcagtgag ccgagattgt gccactgcac tacaacctgg acaacagagc 1140 

aagattccgt cccaaaaaaa aaaaaaaaaa cccga 1175 



<210> 106 
<211> 1021 

<212> DNA 

<213> Homo sapiens 

<400> 106 

ggcacgagaa taaaatagac ttagggtatc cgacac^attt aatacataat gaatagtgaa 60 

tttcttgaat aatataatgc ctaaatgtga ccgatatgca aaatacttaa atccctitaca 120 

ggatgtgcat ctaccaatca gaacagatgt tggcttcact gcttgtatca gttttctgta 180 

ttagtttgct tgtattagtt tgctggggca gccacaacaa agtaccacag aaatttatat 240 

tctcacaatt ctgggggcra gaagaccgag ataaagatgt ctgcagggtt ggtttcttct 300 

aaggactccc ttgttggctt gtagacggtt gtcgtctccc cgcgtcttca cctgcacttc 360 

cdtctgtacc tgcccgtatc ttaacctctt cccataagga cagagtcata tcggattagg 420 

gccccaccct aatgacctcc ttttaactca atcacctctc tagagactct gi.cttcaaac 480 

acaggcacat tccgaggcac tttaaggtta gaacttcaac atatgcattt ggaactgggg 540 

cataaccgag tccatgacac tattggagag ggttctggag ccacaacaga tggaatgaat 600 

gggcctcgtt tggattctaa tttaaataaa ccaactatag aaaaacatgt tntaggcaat 660 

cagatttttt gtcatgggcc aaatattcag aagacgttaa gaaatgttta ttaattccac 720 

taggtaaaat aatgttggtg cggtcacgta gagtacgact cacgtgcaca gaacagaatg 780 

ataaggtgtc caggattcgc ttiiaaaatat tctagacc::c rtcaaaaggt gagaggaaat 840 

gacacatgta tatccaaatt ctga^iggctg rtgaagccgg ccggtaggca cctagtatct 900 

cattacactt ttccccttgt gtgtciggaa atrrtracaa taagcaggaa aaacaaaagt 960 

aagaatatat gcaccataga taactcgaaa tatgcagaac catgaaaaaa aaaaaaaaaa 1020 

a 1021 



<210> 107 
<211> 830 
<212> DNA 

<213> Homo sapiens 



<400> 107 

ggcacgagta aggactgtgt tcctcacgca ctccttgatc caggcacggc agtccctctt 60 

ttcctgcaca tatccacact cctgccactr czaccczLZc ccctacccct ccgcrtLtca 120 

cctccgactg caaaaagaag cagcagtrcc gaaagcaaga g!:rccctatg aacacggaag 180 

aagacattgg caacttttga gtacaacaac cacacttaac agagcaarrc aagaacatca 240 

gccagcgaac ntcatacaag acagtgaaag agaaaaggaa garraactag gggcagcrta 300 

ggatgccatc aaacagccca gaatcacggg agcagccgnt caacagaaag gaggccacaa 360 

atctgaggga cacaagccaa gaattggcaa gccaagaaga aggaaaaggr tcgggcagca 420 

aggacaatga ggaacaaaac agagaaccca gaagcaacat cigaccgcca tcattggaag 480 

aattttttcg ccrgctcgag gctggacacc gaagcggacc aggatacrcg agtgactacc 540 

tgatgggctt ttggaactag ctcccaagag gtgaaaatca gcctrtcttc cttctticttt 600 

ctttttttut tictcttgagg caaggtc^Lca cngttgr.tga ggcrgaaccc cctgggctca 660 

agcagttgtc ccaczgcagc ccccccagac actccgtaag ccaaggcagg gggaacactt 720 

tgngctcagt agttcgaggc cgtggcgagc raagatcaca ccgctgcgct cacrtcagcc 780 

tgggcaacac agtgaaaccc cgtcrccatc tgrccaaaaa aaaaaaaaaa 830 



<210> 108 

<211> 1301 

<212> DNA 

<213> Homo sapiens 



<400> 108 

aagagggcgg ccgccgagcr. cttgcccgtc ccgcgctctg agagcggcga gtnaiggaga 60 

cccaggtccc agcccacccc acctcagcca ggcccgcaca ctgacccctg ccttgccccc 120 

cagcgccccg atccatcaag cacacaggcc acgggaacgc cgccggccLt ctggctgcca 180 

ggggccccac ggcaggaggc cggcccgagg gccagcactc agaggatgag gacacagaca 240 

cagacgagca caaggaagcc aaagccagca caaaccccgt gaccgggagg gtggaggaga 300 

agccgcccaa ccccacggag ggcacgacag aggagcagaa ggagcacgag gccacgaagc 360 
tggtgaccat gtrtgacaag ctctccagga acagagtcat ccagccaatg gggatgagtc 420 

cccggggtca tcttacgtcc ctgcaggatg ccatgcgcga gacnatggag cagcagctct 480 

cctcggaccc tgactcggac cctgactgag gacggcagct cttctgcccc cccatcagga 540 

ctggtgctgc tcccagagac tcccttgggg ccgcaacctg gggaagccac atcccactgg 600 

atccacaccc gcccccactt ctccatictca gaaacccctt ctctcgactc ccgctctgtt 660 

catgatttgc ctctggtcca gtttctcatc tctggactgc aacggtctcc ctgtgccaga 720 

actcaggctc agcctcgaac tccacagacg aagtactttc ttttgtctgc gccaagagga 780 

atgtgttcag aagctgctgc ctgagggcag ggcctacctg ggcacacaga agagcatacg 840 

ggagggcagg ggtttgggtg tgggtgcaca caaagcaagc accacctggg attggcacac 900 

tggcagagcc agtgtgttgg ggtatgcgct gcacttccca gggagaaaac ccgtcagaac 960 

tttccatacg agtatatcag aacacaccct tccaaggtac gtatgctctg ctgttcctgt 1020 

cctgtcttca ccgagcgcag ggctggaggc ctcttagaca ttctccttgg tccccgttca 1080 

gctgcccact gtagtatcca cagcgcccga gttctcgctg gttttggcaa tcaaacctcc 1140 

ttcctactgg tttagactac actcacaaca aggaaaatgc cccccgtgcg accacagatt 1200 

gagatttata ccacatacca cacatagcca cagaaacacc atcttgaaat aaagaagagt 1260 

tttggacaaa aaaaaaaaaa aaaaaaaaaa aaaaaactcg a 1301 



<210> 109 
<211> 1932 
<212> DNA 

<213> Homo sapiens 



<400> 109 

aatttttttt tttttttctt ttttcttgaa aaaaaaaatg ggtagtgtac attctgcagg 60 

tttaagacaa cccaggacaa taaaaacaat ggactttaca tgtgtatata tatagctctc 120 

ttaggcacca caatcagtat gagccaacaa cattcaaact cgatrcaggc cacattcaga 180 

catttgctct tacatacaaa tatttaaatt aaatacaa^c cgaaatgtgt cctgttacat 240 

acaaaaaagg aaaaactata caacgcagag cagtgtgtgc gttctaaaca attacattta 300 

catgtaagct aaatggaacc agcaacggtg ctcaagtttt tatcatccct cccagaaaat 360 

cttttcctac cacctcttct attttttgcc cggcttcgcc ggaacatggc tcgcggttct 420 

ccagtttcat gtccttattia gggaaggcac tcgagcagag gacaggaccc cccgagtgtc 480 

ctccacatcg gcttgcgacc ttgccgttga agacttgacc gagcacatcg aagaacggca 540 

ggagccgctc cacactgcgc acggcgcaga tggtgagcag caagcgccct ggctcccaac 600 

ccaacgctct ccctgagttg tcttcctctg gattctccrg cagaaaacaa aaagcgaact 660 

ggtattaaca caacagacaa tgtggtatgt tiagaaaaarc aaaaatatau aaaccttggc 720 

aactggtcaa gaaacgaaca caaacgacai xaagttccca actcczgacc ccaccaaaac 780 

cctcggcgct: tctgagacct nttaccgcca ccLaccagrt ccacacggag cagtccaaca 840 

ttgcagtaat agttcccaac tagaacgcgc agataagcrt agrcaacaga aacagctctg 900 

aacaggaata gagtcaaaca taaaagrtrr aigctgcgc:: ccgcacrcac ccaaaaagct 960 

cccaggtttc tgaaccccca ctac^gtaac caaggaccag gzcacaaaat taccacagaa 1020 

aaaaggaaca aagcgcttca tacatctcac aatacacccc ctzctactac aaccagttaa 1080 

ccccccttta tccaaatggc ccaaatccgc catgatggca gcagtg::cca aagtgaataa 1140 

ttactgtcag tactgcacca cagagaaagg aagggatccc tcaggagaca ctgctgcctc 1200 

cctctgggtc gtgctaaaca acacagggag gaaagctgga cccggagcca aaggaattga 1260 

gttagtgcgc tggctccgcc ataccracgg cacccttggg caggatatac aaaggcLCCt: 1320 

cacttataaa atgggacagt ccaaaactac cct-tagcag agaagtcaaa tgagaaggta 1380 

tgtgaaaacL ctgccaacta aacacaaaga cuaacaactt gggtattaag aggctagtct 1440 

gagaagccac ctgaatcaca caaacacagc lacagacazc accctgrcca gagaaagara 1500 

agagagaaca ggccggtcga acttgggcag aaccacagat: acaarcccac accaaagaat: 1560 

gaaaacaagc aatgaacnag acagaaggaa gaaatcarga agaccragga agcagaacca 1620 

caatccgtca rartaacaaa cggagcttgc cctccaagac cagatgrngc t.cagaaactc 1680 

tcattgttta cctaataacc caatiaccacc agctccctag iggg::caagc agatigcaaaa 1740 

ticcagcctat tctcctccat gcgcrcccaa gcrcattgcc tatctcaaag caaaaccccg 1800 

aaaaaggaaa ataccaggcr ggtgcaaacg laatcgcggc tttcgcat:;:g tcgaaacctg 1860 

ccgctctaca ccggagcaca ccctcaaaca aacgtggcea tgttatacaa aaaaaaaaaa 1920 

aaaaaactcg ag 1932 






<210> 110 

<211> 1534 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> (1212) 

<223> n equals a,t,g, or c 



<400> 110 

tcgaagatag gctcgagacg gtggatgttg cagctgacca tgcagttggg ttcggtgctg 60 

ctcacacgct gccccttctg gggctgcttc agccagccca tgctgtacgc tgagagggct 120 

gaggcacgcc ggaagcccga catcccagtg ccctacctgt atttcgacat gggggcagcc 180 

gtgctgtgcg ctagttccat gtcctccggc gcgaagcggc gctggiitcgc gccgggggcc 240 

gcactccaat tggccatcag caccracgcc gcccacatcg ggggctacgt ccaccacggg 300 

gactggctga aggtccgtat gtactcgcgc acagttgcca tcatcggcgg ctttcttgtg 360 

ttggccagcg gtgctgggga gccgtaccgc cggaaaccrc gcagccgctc cctgcagtcc 420 

accggccagg tgttcctggg catcracctc arccgrgcgg cctacccact gcagcacagc 480 

aaggaggacc ggccggcg-a tctgaaccat c'cccaggag gggagctgac gacccagctg 540 

ttcttcgtgc tgtatggcac cccggcccct. ggccti-rcig zcaggczact acgcgaccct 600 

cgctgcccag atcctggctg tac-gccgcc cccugccatg ccgcccarcg acggcaacgt 660 

tgcttactgg cacaacacgc ggcgcgtcga grtccggaac cagazgaagc tcctcggaga 720 

gagtgtgggc atcttcggaa czgctgtcaz ccrggccact: gacggctigag ::c::watggca 780 

agaggctgag atgggcacag ggagccactg agggtcaccc tgccttcccc cttgctggcc 840 

cagctgctgt ttatttatgc tttctggtcc gtttgttcga tccttcgctc cttcaaaatt 900 

gttttttgca gttaagaggc agctcatttg cccaaatc^c tgggcttcag cgcttgggag 960 

ggcaggaacc ccggcactaa tgctgcacaa ggttuttttc ctgtcaggaa gaactcgagg 1020 

ccagctgccc acngagtctt ccgtccctga agaaagggag cactgggcag ggcccgggat 1080 

ccggctactg agagtgggag agtgggagac agaggaagga agatggagat tggaagtgag 1140 

caaatgtgaa aaactcctct tcgaacctgg cagacgcagc taaactctgc agtagcgttt 1200 

ggagactgtg anagggagcg tgtgtgctga cacatgcgga icaggcccag gaagggcaca 1260 

ggggctgagc accacagaag ccacacgggc tctcagggca tgccaggggc agaaacagta 1320 

ccggctctct gtcactcacc ttgagagtag agcagaccct grtcrgctct gggctgtgaa 1380 

ggggtggagc aggcagtggc cagcctzgcc cctcccgccg rccctgrtrc cagcrccatg 1440 

gtnggcctgg "gggggcgga gttccctccc aaacaccaga ccacacagtc ctccaaaaat 1500 

aaacacttta tatagacaaa aaaaaaaaaa aaaa 1534 



<210> 111 
<211> 2871 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (1234) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (1259) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (1283) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (1284) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (1287) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (1378) 

<223> n equals a,t,g, or c 

<220> 

<221> SITE 
<222> (1912) 

<223> n equals a,t,g, or c 

<220> 

<221> SITE 
<222> (1913) 

<223> n equals a,t,g, or c 

<220> 
<221> SITE 

<222> (1935) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (1947) 

<223> n equals a,t,g, or c 



<400> 111 

gcggcgaggc ggcggtggtg gctgagrccg zggzggccicci ggzoaaggcg acagctccag 60 

gggttggcac cggccccgag aggaggacgc gggticcgca: agagctgacg crgccgctgc 120 

g^^gcggtgct gctgagctcg gcctcggcgt czzcggaz-na acaaggcagc caggacgaac 180 

cctLaggatt ccaagaccac ttr.gaca-ca gacgagccag caaaggacca cacractgca 240 

ggcagagcag ttgctggtca aatait:tc::c gancagaag aacc.tgaari: agaaccctct 300 

atucaagaag aggaagacag cctcaagagc caagaggggg aaagigrcac agaagacatc 360 

agctttctag agtccccaaa tccagaaaac aaggaccatg aagagccaaa gaaagtacgg 420 

aaaccaggta gtctggacat tzzcczzgcz ticrgatrra rreaggcgac aactigaaaat 480 

tttaagctaa tgaataaaga agctgaagaa gaczggczzc acicacraic acccacaaac 540 

aatatatgga gtgcagttcg gagagaattt cragatt'ca acacaccaaa gttactcact 600 

taacagattt gacccaagtt acataagcrc aatrccaiag azgagattgc acccatctLC 660 

ttattaatat cctgacgttt cttcccgcaa icaagcatg- aaacaccttc cccactgaga 720 

aatittaaaaa aacaacatgt cagggcagca racaaaacac aaactcagga aqaaccgcct 780 

aacatgccac tatggtcact: ttcaggartg rgccaggatt azzactacaa taaccttgtc 840 

gatcatccta ggtaattuga tttaggigct z^zccczcca gcrtgacgcc cgtgtgtact 900 

tccatcctac tccaactttc acaaccicga aaccrc^act zzzcaazzza aaacccgacc 960 

tctttcacaa tagtccccta aatarcaacc czr^taaaaa ctattaacgt acgcattcta 1020 

ccacgtctcg ttgcccttct gtcac-tiacc a-zzcaacga ccaccgccct atc::gcttat: 1080 

tattccatza ctaaccactc tcaggaactg a-ccaaaccg gaacrtc-na aaaaaacaga 1140 

atttttttct actacaaana taccac-cnc azgaacagaa aaccactczc aaaaaatcaa 1200 

aagtacctga atcccatcaa ctagaaacaa cagr.ngccca cancacggag aatactctnt 1260 

caggatttc- ccaggtgtat cannagnttr aagaaaagca aaccggg-ca aaatgcgtat 1320 

tctatgtcgc agcctgcart ttgcacn^aa cagcucacca cgaatgCTict tccacgtnac 1380 
ttagcttatt tccaatctga tagtcaatgg ctataaaaaa tttcattcgc aaatacatgg 1440

taccataaac caaaacgrtc atgtcttgcc gggagatcat tccagatgta tctctgtgcc 1500 

tatctgtgaa aacttcctta gaacncttaa agtaaatccg ccgaattgga cgtaaaagtt 1560 

caaaagacag tagatacaca ttgttaaatt gttcttcaga aaagtgattc tgtccacttc 1620 

ccctgaaaat acctctctcc ctgtacctca cccaagtgtg taccaccatt ttaaaatttt 1680 

tactaaacat aatctaatcc attctccat::: ttttaattag ggactttttt tcttttcttt 1740 

tagcttttat acttttagag acatggtctc actccgccac ccaggccgga gtgcagtggc 1800 

acgatcatag ttcactgcag ccctgaactc aggtgatctc cctcttcggc ctcccaaagt 1860 

gccgggatta caggcatgag acactacatc aagcctggga aatttttaaa cnngcataaa 1920

aagtagaggg taaancgcca ccccttncca accctcactc agcttcaaca aacatcaaca 1980 

ttctgctggc cttttcatct tcactaccct acacacattt gccctttttt tcttccctag 2040 

aacatttcaa ggcaaatccc gcatgtgctt ttacatctct gtgtccccct ctagcaagaa 2100 

tgctcttttc agcaaaatca cagaattagc cattactcct tcatcaccta atacccagac 2160 

tgtttaattt ttctagttgc ttcaagtgtc tgttcacaat tcatttgttt gaatLcattt 2220 

ctaagctact caggaggccg agtcaggaga atcgcttgaa ccctagaggc agaggttgca 2280 

gtgagctgag accgtgccac tgtactccag cctgggtgac aaagtgagac tccatgtcat 2340

aacaaaacaa aacaaaacaa aacatgacga aacaaacaac acaattgaat: aatacccccc 2400 

agttattatt tagagaagag atgacccaat acaaatgiiaa tatgtgatcc actcggtcac 2460 

agaatgaatt tcagtcattg cttagguaag cccggtaaaa aaaaaaaaaa cgggactcca 2520 

tgatcagtgt gttgaaaaag gggtttgccc aaagtgtaac agacagagcc gggtgcagcg 2580 

gcccacgccc gtaatcccag aactitggga ggccaaggca ggcggaccac gaggticaaga 2640 

gattgagaca accctggcca acatggtgaa gccctgtatc aactaaaaac acaaaaacaa 2700

ttagctgggc gtggcgacac gcgcccgtag Lcjccagctac tcaggaggct gaggcaggag 2760 

aactgcttga acctgggagg cggaggccgc agcgagctga gattgcccca ccgcactcca 2820 

gcccggccac agagtgagat: cccatcccaa aaaaaaaaaa aaaaaaaaaa a 2871 



<210> 112 
<211> 1037
<212> DNA

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (936)

<223> n equals a,t,g, or c
<220> 

<221> SITE
<222> (946) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (951) 

<223> n equals a,t,g, or c 



<400> 112 

ccggtccgga atccccgggc cgacccacgc gcccgggcgg gcgcaggacg tgcaccatgg 60 

ctcggggccc gccgcgccgg ccgccgcgcc tccccgtacr. gggcctcrgg ctggcgttgc 120 

tgcgccccgt ggccggggag caagcgccag gcaccgcccc ctgczcccgc ggcagctccr 180 

ggagcgcgga cctggacaag cgcatggacc gcagcacccc ctgccccczz ccggctgcct 240 

tggcccaccc ttgggggcgc cczgagcctg acctzcgzgc cggggccgct tccnggcttc 300 

tcggtctgga gacgargccg caggagagag aagtccacca cccccacaga ggagaccggc 360

ggagagggct gcccagccgc ggcgccgacc cagcgacaat gcgccccccg ccagccgggg 420 

ctcgcccacc caccatccat tcaccca::tc cagagccagt ctctgccccc cagacgcggc 480 

gggagccaag cccctccaac cacaaggggg gcggggggcg gtgaatcacc nccgaggcct 540 

gggcccaggg ttcaggggaa ccctccaagg tgcctggccg ccctgcccct ggctccagaa 600 

cagaaaggga gccccacgct: ggcccacaca aaacagccga caccgaccaa ggaaccgcag 660 
catctgcaca ggggaagggg gcgcccccct ccctagaggc cctgggggcc aggccgactt 720

ggggggcaga cttgacacca ggccccactc actcagatgt cctgaaattc caccacgggg 780 

gtcaccctgg ggggttaggg acctattttt aacaccaggg ggctggccca ctaggagggc 840 

tggccctaag atacagaccc ccccaactcc ccaaagcggg ggggggatat c.tattctggg 900 

gagagttcgg aggggagggg gatctttttt ttaaangatt tctcantttt naaaaaaaaa 960 

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1020 

aaaaaaaaaa aaaaaaa 1037 



<210> 113 
<211> 2214 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (289) 

<223> n equals a,t,g, or c 



<400> 113 

cggacgcgcg ggtcgaccca cgcgtccggg aaaaarggaa aaracgccgt gcaaaacctc 60 

gttctgcgtc cgaatngccg caggctcaga tcttcatctg aggttctgrg tctgaactgc 120

cgtaggccca gatcttcact tgaggttatg tcccacaagt xaacgctgat ctcgcgtgag 180 

ctttcggtag ctggagtaac acaggcggcc rcacagcgac ctccccagcg ccttccaagg 240 

cacacctgca gccagcgtam tcctcctggg agatgcctcc tcaaggccnt gctccagacc 300 

acgtgggrar ggcctgacaa gccaattccc aggctgtccc cacccttgra gagtgaccct 360

aaacgctaga cagacgggga atgggaaaga aaagaaagct gcagacctca agttaaaatt 420

ccctcaaaaa cgtctttatt tarcrgcttt ttctgaaagg ataaaggctt tttgaaaatt 480 

atttcctaac aaataacatg aacacttcta gaaaccctag aaaaacacaa agcattcaaa 540 

atagaaagaa aaattaccca ttactcttta agccagcatt atccattgcg gtgcttttgg 600 

agctgggtga ggccgtagcc tccgccaagt caaggagccc ggcggcggct gtggcactcc 660 

tgcagggttg tctttttttc ttcgagatgg agtctcactc ccgtcacccc agccggaatg 720 

tggcggtgta aacagcccac tgcagccttg accctgaggc tcaagcgatc crtctgcccc 780 

ggccccctga gcagctggga tcccaggcga gagtcaccac accctgccca tgtLCCtgca 840 

ggtcttgaca tgcgaggacg ctgtgtcttc cctgccacat tcccrtcttc ccccttgaga 900 

cagacccttg ccccatcacc caggccagag tgtggrsgcg cgaacacggc tcactgcagc 960 

ctcgaccctc aggctcaagc gacccccacc cctcggaccc ccaaagrgcc gggatcacag 1020 

gcgagagcca ccacgcuggc ctgaatcttc agggtaritcr cggcrgarg:: gycacctact 1080 

tarecatscc tgrttcaaga gtgtaggtgg zcaccczgzc tccgccgctg acctggcctg 1140 

gaccctcggc cgcgagaggg aggggcgggc tgggctggag gaacctraag ccctcgtgat 1200 

gccacaagcc catctggctg ggcaccccct gctgtgtcct gagcrgcaca rgccccaggt 1260 

ggcccccaca gcagaggcga gccac-grag ggcgragggc ttccaccgac ggtctccagg 1320 

ggragaagaa gggcccaggc ccccaggaga ctcaggagac cagagcctgg ggtcaggggc 1380 

tmagcagggg ccyarccagg gccggatgcc cggagccagc cccgmagccc rgkgktctct: 1440 

gttcttcgca ctcccaccgc ccgrgtgaac agctccagcc ccacccgcgc ctccctgtgc 1500 

tgggctccat cagggagccc agaagacgtg cgtgctrctg aaatcgggtic cccacacgcc 1560 

tttgccccag cgcaccttgc tccCwCcact uacr.a::cgag atccaaacgc ccgttttctc 1620 

cccagaggct gacggataca ctcagacgtt acgacacgga ccaggacggc cggattcagg 1680 

cgccgtacga acagtacctg tccaeggtct ccagtatcgc acgaccctgg ccccccgcga 1740 

agagcagcac aacatggaaa gagccaaaat gtcacagtcc ctacccgcga gggaacggag 1800 

cacaggtgca gccagatgcc gctccccctt cagaccccgt cacctgggga cccagctgta 1860 

catatgtgga caagctgatc aacggttccg caactgraat agcagccgca rcgctctaat 1920 

gcagacattg gatttggcga ctgtcccat: gcgccacgag gtaaatgtaa cgtctcaggc 1980 
attccgcttg caaaaaaacc caccacgcgc ctccctagac gcccctggtc ccacagcgca 2040

aacgctttta ttagccaata ggaattctaa aacaacacgg aacctacaca aaaggctttc 2100 

catgtgcctt acttttttaa aaaggagttt actgtatcca ccggaatacg cgacgcaagc 2160 

aataaaggga atgtcagacg tgcaaaaaaa aaaaaaaaaa aaaaaaaaaa aaag 2214 
<210> 114
<211> 3300
<212> DNA

<213> Homo sapiens

<400> 114

tcgacccacg cgtccggtga gaggcaagga cttcccatct ttagggcatt ccgaccacgt 60

cctgcttcag agagattgtt cccggcgtct cagtgctatg gggagcaggt tccccctggt 120

cctgctctca ggtctcactg tcctaccggc tctgccagga tcagaagcca agaattctgg 180

agcttcctgt cctccatgcc ctaaatatgc cagctgccac aacagcaccc actgcacttg 240

tgaagatggc tttcgggcca ggcctggcag gacatacttc catgacccct ctgagaagtg 300

tgaagatatc aatgaatgtg aaaccgggct ggcaaagtgc aagtacaaag catatcgtag 360

gaataaagtc ggaggttaca tctgcagctg ttcggcaaaa tataccttat tcaactttct 420

ggctggtatt atagattatg atcatccgga ctgttatgag aacaatagtc aagggacgac 480

acagtcaaac gtggatattc gggcgagcgg ggcgaagcct ggatttggga aacagctggt 540

acgtataact atgccatttt cctacccaaa cattaacatg tcttcccgtg atttttaggg 600

tagggtagtt ctatccaggg gtaattttgt cctctgtccc aaggtcatct gtcaacgact 660

ggggacactt ttggttgtca tacctcgggg gtgatgngtg tgactggcat ctggcggatg 720

gagaccaggg atacagccca acatcccaca gcgcccagga cagcctccca caaccaagaa 780

gtgcctagtg ccatatgtcc acagagacag agaaatacaa gcgtaggggg aaagcgcctc 840

agctggcatt caaagaccta catcaagcac ctagaccccc aacgccacac gcaccttgta 900

gcaccttaat aaatatcgtt gcctcccggg ctccccactc caacacrtgc gcacatcccc 960

CaLC'CCLCciL; atcccagtaa gagccaacct eiaCcagct: ta 1020

tgagaagaaa tggaagcaga gaggacttcg caagaagggc tacLcaacta atccaaagcg 1080

tggagttgag catctggaat gcgagttttg crtctccagg aaagggtcaa atttctgaat 1140

ttgatatagt ctatgaaacc aagaggtgca atgagacaag ggagaacgct cttcnggaag 1200

ctggaaataa caccatggat atcaactgtg ctgatgcttt aaaaggaaac ccaagagaga 1260

gcactgcagt tgccctatca cttatcaatc tctcggggac atcctgaatg caLccttttt 1320

tagtaaacga aaagggatgc aggaagtaaa actgaaccct tacgttgtga gcggcaccat 1380

cggtctgaag gaaaaaattt CGCtctctga acctgtgttc ccgacttttc gccacaacca 1440

gcctggtgac aagagaacaa aacatatctg cgtctaccgg gagggaccag agggaggccg 1500

ctggtccacg gagggctgct ctcatgcgca cagcaacggc tcctacacca aatgcaagtg 1560

cttccatccg cccagctttg ccgccctcgt ggcrcttgcc cccaaggagg accctgtgct 1620

gaccgtgatc acccaggtgg ggctgaccat ctccctgctg tgccccttcc cggccaccct 1680

caccttcctc ctgtgccggc ccatccagaa caccagcacc rccCwCca wC tagagctctc 1740

cctctgcctc ttcctggccc accccctgtt ccrgacgggc atcaacagaa ctgagcctga 1800

ggtgctgtgc tccaccattg cagggc ^gct: gcacztcczc cacccggctt gcnccacctg 1860

gacgctcccg gaagggctgc acczcztccz caccgtcagg aaccrcaagg nggccaacta 1920

caccagcacg ggcagatcca agaagaggtt cacgcacccc gtaggccacg ggaccccagc 1980

tgtgactatt gctgtgtcag caacagttgg accccagaar tatggaacac ctactcactg 2040

ttggctcaag cccgataaag gatccacctg gagcttcazg gggccagtag cagtcattat 2100

cctgataaac ctggtgtcct acLcccaagt cctgcggatt ccgagaagca aac-ctcctc 2160

ccccaataaa gaagtttcca ccattcagga caccagagtc atgacactca aagccatttc 2220

tcagccattt atcccgggct gctctcgggg ccttggttct tctatcgtrg aagaagtagg 2280

gaagacgatt ggatcaatca ccgcacaccc azzcaccazc accaacaccc tccagggagt 2340

gttgctctcc gtggtacact gtccccttaa zcgccaggzz cgaacggaac ataaaaagtg 2400

gtttagtggg atgcggaaag gggcagaaac tgaaagcact gagacgtCwC gctccactac 2460

ccaaaccaaa acggaagaag tgggaaagtc crcagaaacc tt^cataaag gaggcactgc 2520

atcatcatct gcagagtcaa ccaagcaacc gcagccacag gcccatcccg tctccgccgc 2580

ttggctaaag atgaactgac ccggcaagcg ccacggcaat gacccggaag ctacccctcc 2640

tttccgtttg tctacagcgc ccczgzggzc acacacaca- cggacaaacg ccaccatttc 2700

tagctttcct gtgaaaagtc caggcccat cacczazzzz ggct ::cc-ai: gcccacagaa 2760

agaacaagac atctgggaga attctcagat ccagagtcca gcag-gcggc acgcgcaacg 2820

aagtgtcgga aggatgcatt ctaaagacgg cgggcgggag aagcggattt tctccctgca 2880

gctactgcca ccttgccaga aactcactaa ccggcatctg gattcagctc acagitccct 2940

ttctggcctc tctgctgtac ttcacgctcc caaagatcct acatcaacac tccacattca 3000

cataattcaa caatcttcat atggaccagt aetaaagagg gcgccgcatt ttgcaacaca 3060

aaaatgcatc accaggtgcc ggagaggacg cggagaaaca ggaacactct tacactgticg 3120

gtgggactgt aaacLagtcc aaccatcgcg gaagccagtg tggcgacccc ccagggatct 3180

agaaccagaa ataccattcg acacagctat cccatcactg ggtatacacc ccaagggcta 3240

taaatcatgc tgctataaag acacacgcac acgcaaaaaa aaaaaaaaag ggcggccgct 3300



<210> 115

<211> 1286
<212> DNA

<213> Homo sapiens

<220>

<221> SITE
<222> (1149)

<223> n equals a,t,g, or c

<400> 115

gactgcttca cggacattgg atgaagccga agcatctaga atggtgcctg gcacacagtt 60

ggtgcgtgat atggrtaagc cttgcgcccc cacccacatc ccatcccgaa cgtgacggtt 120

tccccggctc cctcctgccg ccatgtgaag aaggtcgcrg cztccccttc accttccacc 180

accatgatcg ccacggatgc tccccaczcc aaagcagccc rcgacagcat taacgagctg 240 

cccgagaaca tcctgctgga gctgctcacg cacgtgcccg cccgccagcc gccgccgaac 300 

cgccgcctgg tctgcagcct ccggcgggga ccccarcgac cccatgaccc tccggaaacg 360 

caatgcctgc gagagggctt catcaccaag gaccgggacc agcccgtggc cgaccggaaa 420

atcttctatt tcctacggag ccrgcatagg aacccccgcg caacccgtgr gccgaagagg 480 

atatgtttgc atggcaaatt gattrcaatg gtggggaccg ccggaaggcg gagagcctcc 540 

ccggagccca cgggacagac ttccccgacc ccaaagtcaa gaagratttc gtcacatccc 600 

acgaaatgtg cctcaagccc cagccggtgg accctigcagc cgagggccac cgggaggagc 660

tactagacac attccggccg gacatcgtgg ttaaggaccg gtrcgccgcc agagccgact 720 

gtggctgcac ctaccaactic aaagcgcagc cggccccggc tgactactcc gtgtcggcct 780 

ccttcgagcc cccacctgtg accatccaac agtggaacaa Lgccacacgg acagaggtct 840

cctacacctt ctcagactac ccccggggcg cccgctacat cctctrccag catgggggca 900

gggacaccca gtaccgggca ggctggtatg ggccccgagc caccaacagc agcattgtcg 960 

tcagccccaa gatgaccagg aaccaggcct cctccgaggc tcagcctggg cagaagcatg 1020 

gacaggagga ggctgcccaa tcgccctacc gagctgttgr ccagatttcc tgacagctgt 1080 

ccatcccgtg tctgggtcag ccagaggt::c ctccaggcag gagctgagca cggggcgggc 1140 

aagtgaggnc cctgtaccaa gcgacttccg ccccggttca acccraccaa gctcggggga 1200 

actwactgca cacagctccg acgttcrgrv gcaaiaaarg zzzzc^Lcgcc gggcaaaaaa 1260 

aaaaaaaaaa aaaaaaaaaa aaaaaa 1286 



<210> 116 
<211> 2189 

<212> DNA

<213> Homo sapiens 



<400> 116

ggaatccggc acgaggcgcc cgaggatgcg ccgccggccg ctgctcctgc -gtgggggci 60

gctccccggg acggcggcgg ggggctcggg ccgaacctac ccgcaccgga ccctcctgga 120

cccggagggc aagtactggc tcggccggag ccagcggggc agccagaccg cczzccgccz 180

ccaggtgcGc actgcaggcc acgcgggcrc cggczzczcg cccaccgggg ccacggcgtc 240

cgccgacacc gtcgtgggcg gggtiggccca cgggcggccc cacc^ccagg at'atctcac 300

aaatgcaaai agagagttga aaaaagargc tcagcaagat taccatctag aacacgccat 360

ggaaaacagc acacacacaa caatcgaatt taccagagag czgcazacat gtgacacaaa 420

cgacaagagc ataacggaca gcaczgcgag agrgatcrgg gcctaccacc acgaagatgc 480

aggagaagcc ggtcccaagt accacgaccc caacaggggc accaagagct cgcggctatt 540

gaatcccgag aaaactagtg tgctacccac agccctacca taccttgacc tiggtaaacca 600

ggacgtcccc accccaaaca aacacacaac atattggngc caaatgtcta agattcctgt 660

gccccaagaa aagcaticacg caanaaaggt cgagccagtg acacagagag gccacgagag 720

tctggtgcac cacatcccgc ::c;:accagcg cagcaacaac ctcaacgaca gcgcccctgg 780

aarccgggca cgaatrgcca tcaccccaac atgcccgacg cactccccac crgcgaaact 840

gtgactttcg cctgggctat cggtggagag ggctutccct atccacccca t^gttggacta 900

tcccttggca ctccattaga tccgcattat gtgctcctag aagtccatta tgataatccc 960

acttatgagg aaggcttaat agataattct ggactgaggt taccttacac aatggatata 1020

aggaaatatg atgctggggt gattgaggct ggcctctggg tgagcctctt ccataccatc 1080

cccccaggga tgcctgagct ccagtctgag ggtcactgca ctttggagtg cccggaagag 1140

gccctggaag ccgaaaagcc aagtggaatt catgtgcttg ctgttcttcc ccatgctcac 1200

ctggctggca gaggcaccag gctgcgtcat tttcgaaaag ggaaggaaac gaaatcacct 1260

gcctatgatg atgattctga cttcaatttc caggagtttc agcatctaaa ggaagaacaa 1320

acaatcttac caggagataa cctaattact gagtgtcgct acaacacgaa agatagagct 1380

gagatgactt ggggaggact aagcaccagg agtgaaatgt gtctctcata ccttctctat 1440

tacccaagaa ttaatctcac tcgatgtgca agtattccag acattatgga acaacttcag 1500

ctcattgggg ctaaggagat ccacagacca gtcacgacct ggcctttcat tatcaaaagt 1560

cccaagcaat ataaaaacct ctctttcatg gatgctatga ataagctcaa atggactaaa 1620

aaggaaggtc tctccttcaa caagctggtc ctcagcctgc cagtgaatgt gagatgttcc 1680

aagacagaca atgctgagtg gtcgattcca aggaatgaca gcattacctc cagatataga 1740

aagaccctac aaagcagaac cctctggtgt gcggcacgtc ctcttcctct tccctgcaca 1800

gagattcccc ccatcaactt gcttgcttgc Cwtctgctac tcagctgcac gctgagcacc 1860

aagagcctgt gatcaaaatt ctgttggact tgacaatgtt ttctatgatc tgaacctgtc 1920

atctgaagta caggttaaag actgtgtcca cttcgggcat gaagagtgtg gagactttcc 1980

ttccccattt tccctccctc ctctttcctt tccatgctac atgagagaca tcaatcaggt 2040

tctcccctct ctcttagaaa tacctgacgc tatatacaca tggccaacaa aacaaaactg 2100

gcctgactta agacaaccac tccaaaaaac cgggctgtca cgtgggaara aaagaatcct 2160

ttctttccta aaaaaaaaaa aaaaaciaaa 2189



<210> 117 

<211> 1763 

<212> DNA 

<213> Homo sapiens 



<400> 117 

ggtggcctag agatgctgct gccgcggttg cagtcgtcgc gcacgcccct gcccgccagc 60 

ccgccccacc gccgtagcgc ccgagtgccg gggggcgcac ccgagtcggg ccatgaggcc 120 

gggaaccgcg ctacaggccg tgctgccggc cgtgctgctg gtggggctgc gggccgcgac 180 

g^gtcgcctg ctgagtgggc agccagtctg ccggggaggg acacagaggc ctcgttataa 240 

a^'ccatccac tcccatgata cctctcgaag actgaaccct gaggaagcca aagaagcctg 300

caggagggga cggaggccag cragtcagca ccgacctccg aagacgaaca gaaaccgata 360 

gaaaakttca tcgaaaaccc cctigccatcc. gacggcgay:: cccggawtgg gcccaggagg 420 

cgcgaggaga aacaaagcaa tagcacagcc gccaggacct ccazgctcgg actgacggca 480 

gcacaccaca atttaggaac cgccatgcgg acgagccgcc crgcggcagc gaggtccgcg 540 

tggtcatgta ccatcagcca ticggcacccg ctggcaccgg aggcccctac atgctccagt 600 

ggaatgacga ccggrgcaac atigaagaaca attticactcg caaacactct gatgagaaac 660 

cagcagttcc ttctagagaa gctgaagg::g aggaaacaga gccgacaaca cccgtacttc 720 

cagaagaaac acaggaagaa gatgccaaaa aaacacttaa agaaagtaga gaagctgcct 780 

tgaatccggc ccacatccta atccccagca tzccccttcz cctcctcctt gkggtcacca 840 

cagttgtacg ttgggtttgg acctgtzagaa aaagaaaacg ggagcagcca gaccccagca 900 

caaagaagca acacaccatc tggccczczc cticaccaggg aaacagcccg gacctagagg 960 

tctacaatgt catwagaaaa caaagcgaag ctgacwcagc rgagacccgg ccagacctga 1020 

ag'aatatttc attccgagtg tgtticgggag aagccactcc cgatgacacg tcttgtgact 1080 

atgacaacat ggctgtgaac ccarcagaaa gtggg^r::gt gaccctggtg agcgtggaga 1140 

gtggat-tgt gaccaatgac attcatgagc tctccccaga ccaaatgggg aggagtaagg 1200 

agtctggacg ggtggaaaaC gaaatacatg gttattagga cacataaaaa accgaaactg 1260 

acaacaatgg aaaagaaacg acaagcaaaa tcctctcatc ctctataagg aaaacacaca 1320 

gaaggtctat gaacaagctt agatcaggcc ccgtggatga gcatgtggtc cccacgacct 1380 

cccgtcggac ccccacgttc tggctgcatc ctttatccca gccagccatc cagctcgacc 1440 

ttatgagaag gtaccntgcc caggtctggc acatagtaga gtctcaataa atgccacctg 1500 

gctggttgta tctaactttc aagggacaga gctttacctg gcagtgataa agatgggctg 1560 

tggagcttgg aaaaccacct ctgttttcct tgccctatac agcagcacat attatcatac 1620 

agacagaaaa cccagaacct ttccaaagcc cacacatggt agcacaggct ggcctgtgca 1680 
ccggcaattc tcatacccgt cttcttcaaa gaacaaaatc aaacaaagag caggaaaaaa 1740
aaaaaaaaaa aaaaaaactc gag 1763 

<210> 118 
<211> 1375 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> (18) 

<223> n equals a,t,g, or c 



<400> 118 

ggcgggcggc gcggaagngg cggckgcgcg gccggggcag ccacgtcgcc attgtctgcg 60 

gcgcgggcgg ccctgcgggt ctacgcggta ggcgccgcgg tgatcctggc gcagctgctg 120 

cggcgctgcc gsgggggctt cctggagcca gctycccccc cacgacctga ccgtgtcgct 180 

acagtgacgg gagggacaga cggcactggc tactctacag cgaacatctg gcgagacttg 240

gcacgcatgt taccatagct ggaaacaacg acagcaaagc caaacaagtc gcaagcaaaa 300

caaaagaaga aaccctgaac gacaaagcgg aatctttat^ ctgcgacctg gcttccatga 360 

cttccatccg gcagtttgtg cagaagtcca agatgaagaa gattcctctc cacgrcctga 420 

tcaacaatgc cggggtgacg atggtccccc agaggaaaac cagagacgga ctcgaagaac 480 

attccggcct gaactaccta gggcacttcc tgcugaccaa cctcccctcg gatacgctga 540 

aagagtctgg gtcccctggc cacagtgcga gggtggccac cgtctcctct gccacccatt 600 

acgtcgctga gctgaacatg gatgaccttc agagcagtgc ctgctactca ccccacgcag 660 

cctacgccca gagcaagctg gcccctgtcc tgctcaccca ccacccccag cggccgctgg 720 

cggccgaggg aagccacgtg accgccaacg tggtggaccc cggggtggcc aacacggacs 780 

tCLacaagca cgrgtcctgg gccacccgcc tggcgaagaa gcctctcggc tggttgcttt 840 

tcaagacccc cgatgaagga gcgtggactt ccatctacgc agcagtcacc ccagagctgg 900 

aaggagctgg tggccgttac ctacacaacg agaaagagac caagtccctc cacgtcacct 960 
acaaccagaa actgcagcag cagccgtggt ctaagagttg cgagatgact ggggtccttg 1020

argcgaccct gtgatatcct gtctcaggat agctgccgcc ccaagaaaca caccgcaccc 1080 

gccaatagcc cgcgggtccg tgaagaccgc ggtgtctgag tttctcacac ccacctgccc 1140 

acagggctct gtcctctagt tttgagacag cngcctcaac ccctgcagaa cttcaagaag 1200 

ccaaataaac attttggagg acaaccaccc caagcggtct ccaaccacaa actttgtgat 1260 

tccaaagtgc ccagttgtca caggngccac aaacaactac accttccaac acaaaaaaaa 1320 

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaagggcg gccgc 1375 



<210> 119 
<211> 1022 
<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> (937) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (990) 

<223> n equals a,t,g, or c



<400> 119 

ggcacgagag agccrgcggc ccgagccggg gccacggcga actgctgagc tgcgtcctag 60 

gcccccggct ctacaaaatc cacccggaga gggactccga aagggccccg gccagcgtcc 120 

ccgagacgcc aacggcag::c actgcccccc attccagccc ctgggacacg caccatcagc 180 
cccgtgcccc ggagaaacat gctgacagca tcctggcact ggcttcagta tuctggticca 240

tctcttatta ctcctctccc ttcgccttcc ::ctacttgta caggaaaggt tacctgagtt 300 

tgnccaaagt ggtgccgttt tctcactatg ctgggacatt gctgctactt ctggcaggcg 360 

tggcctgctc cgaggcactg gccgctggac caacccccag taccggcagt tcatcaccat 420 

cttggaagca acacatcgga accagtcttc agaaaacaag aggcagcctg ccaactacaa 480 

ctttgacttc cggagctggc cagtcgactt ccactgggaa gaacccagca gccggaagga 540 

gtctcgaggg ggcccztccc gccggggtgt ggccctgcct cgcccagagc ccctgcaccg 600 

ggggacagca gacaccctcc tcaaccgggt taagaagctg ccttgtcaga tcaccagcta 660 

cctggtggcg cacaccctag ggcgccggat gccgtatcca ggctctgtgt acctgctgca 720 

gaaggccctc atgcctgcgc cgctgcaggg ccaggcccga ctggcggaag agtgtaacgg 780 

gcgccgggca aagctgctgg cccgcgatgg caatgagatt gacaccatgt ttgtggaccg 840 

gcgggggaca gccgagcccc aggacagaag ctggtgatcc gctgcgargg gaatgctggg 900 

ttttatgarg tgggctgcgt ccccamgccc ctggaanctg gatactcatc "ckgggtggaa 960 

tcatccagct ttgctggaac acagggggtn ccattcccgc aaaacgaagg ctaatgccat 1020 

gg 1022 



<210> 120 

<211> 2311 

<212> DNA 

<213> Homo sapiens 



<220>

<221> SITE
<222> (654) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (2293) 

<223> n equals a,t,g, or c 



<400> 120 

tcgacccacg cgtccgcttc ctcacgcngc ccgctcgcct accgttcagg ccgccgagcc 60 

tttcccctcg tggatccgcc cccacggcag cgcgccatgg ccrccgggag ccgctcctgg 120 

agaggaggtg cgctgccgcc tcczccttcc agcactcatc gaccctggga cgcgagcttc 180 

ccta'tgaccc cgtggacacg gagggccttg gagaaggcgg cgacacgcag gagcgtttcc 240 

tgttcccgga gcacatcctg gatccggagc cgcaacccac ccgcgaaaag cagccgcagg 300 

agctccagca acagcaggag gaggaggagc gacagaggca gcagcggcgg gaggagcggc 360 

gacagcaaaa cccacgggcc aggccccggg agcacccggc cgrggggcac ccggacccgg 420

cactgccgcc cagcggcgtg aaccgcccgg gcrgcggggc asasccgcac cgccaggacg 480 

ccggagtgcc cggctacc^g ccccgagaga agtccccccg cacggcggag gcagacggcg 540 

ggctggcacg gaccgcgcgc cagcgccgc:: ggctgctgtc gcaccaccgg cgcgctctac 600 

gcc.tgcaggt gagccgcgag cagcacccgg agctggcgag cgccgcgctg cggnggcccg 660 

gcccctccct ggtgctccac aiggtggacc cgctggacct gcccgacgcc ctgctgcccg 720

acttgcccgc gcrggtigggc cccaagcagc ngatcgcgct gggaaacaaa gcggacctcc 780 

tgccccagga tgctcctggc taccggcaga ggccgcggga gcgaccgtgg gaggactgtg 840 

cccgcgccgg gctcctgccg gcccctggca ccaagggcca cagcgccccg tcaaggacga 900 

gccacaggac ggggagaarc cgaanccgcc gaactggccc cgcacagtgg ccagggacgt 960 

gcggctgatc agcgccaaga ccggctacgg agtggaagag crga::ccct:g cccctcagcg 1020 

cccccggcgc caccgtgggg acgtczaczz agcgggcgcc accaacgccg gcaaacccac 1080 

tctctLtaac acgcccctgg agtccgactia czgcaczgzc aaggcctccg aggccaccga 1140 

cagagccacc acctccccct: ggccaggcac cacaccaaac cctccgaagc ctcctactcg 1200 

caacccaact ccccacagaa tgt-itaaaag gcaccaaaga crcaaaaaag acccaaccca 1260 

agctgaagaa gatctnagtg agcaagaaca aaaccagcct aatgccccca aaaagcatgg 1320

ttacgtcgtza ggaagagctg gaaggacatt cttgcaccca gaagaacaga aggataacat 1380 

tccctttgag tctgacgccg atccacccgc ctttgacacg gaaaacgacc czgttacggg 1440 

tacacacaaa tccaccaaac aagcagaatt gactccacaa gacgcgaaag acgcccaccg 1500 

gctttatgac acccccggaa ttacaaaaga aaattgcact tcaaatcntc caacagaaaa 1560 
agaagcaaac attgtcttgc caacacagtc cattgtccca agaacttttg tgcctaaacc 1620

aggaatggtt ctgtttttgg gtgctatagg ccgcatagat ttcccgcagg gaaatcagtc 1680 

agcttggtcc acagtcgtgg cttccaacat cctccctgtg cataccacct ccttggacag 1740 

ggcagacgcc ctgtatcaga agcatgcagg tcatacgtta ctccagattc caatgggtgg 1800 

aaaagaacga atggcrggat ttcctcctcc cgttgccgaa gacattatgt taaaagaagg 1860 

actgggggca tctgaagcag tggccgacat caagttctcc tctgcaggtt gggtttcagt 1920 

aacacctaac tttaaggaca gactgcatct ccgaggccat acacctgaag gaacagtttt 1980 

gaccgtccgg ccccctctct tgccatatat tgctaacatc aaaggacagc gcatcaagaa 2040 

aagtgtggcc tataaaacca agaagccccc ttcccttatg tacaacgtga ggaagaagaa 2100 

aggaaagata aatgtacgag accgaccttg tccactccag atatcaactg tattgaacac 2160 

aacaaaatac attgaatttg tattaaacat acaacgcata aataaagctc ccattcttac 2220 

ccttaaaaaa aaaaaaaaag ggcggccgct ctagaggatc caagcttacg tacgcgtgca 2280

tgcgacgtca tanctcgtct ataggaactg g 2311 



<210> 121 

<211> 1286 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE

<222> (1284) 

<223> n equals a,t,g, or c 



<400> 121 

gggcgcgcgg gtgaaaggcg cattgacgca gcctgcggcg gcctcggagc gcggcggagc 60 

agacgctgac cacgtccctc tccccggcct cctccgcczc cagctccgcg ctgcccggca 120 

gccgggagcc atgcgacccc agggccccgc cgcctccccg cagcggctcc gcggcctcct 180 

gccgcccctg ctgccgcagc tgcccgcgcc gccgagcgcc tctgagatcc ccaaggggaa 240 

gcaaaaggcg cactccggca gagggaggtg gtggacctgt acaacggaac gtgcttacaa 300 

gggccagcag gagtgcctgg tcgagacggg agccccgggg ccaacggcac tccgggtaca 360 

cctgggatcc caggtcggga tggatccaaa ggagaaaagg gggaargtc:: gagggaaagc 420 

tttgaggagt cctggacacc caactacaag cagtgttcac ggagtccatt: gaactatggc 480 

atagatcttg ggaaaactgc ggagngtaca tctacaaaga rgcgttcaaa tagtgctcta 540 

agagctttgt tcagtggctc acttcggctia aaacgcagaa atgcat:gc~g tcagcgttgg 600 

tattccacat tcaatggagc tgaatgctca ggacctcttc ccactgaagc tacaatttat 660 

ctggaccaag gaagccccga aacgaattca acaatcaaza ">caccgcac tcctcctgcg 720 

gaaggactct gcgaaggaat cggcgctgga cragcggatg ctgcratccg ggrtggcact 780 

tgttcagatn acccaaaagg agacgctcct: accggacgga acccagiccc ncgcaccatc 840 

attgaagaac taccaaaata aacgcttcaa ttttcactcg ctacctcttt: tttcatitatg 900 

ccrtggaacg gttcactcaa acgacattiit aaataagntc acgtacacac ccgaatgaaa 960 

agcaaagcta aatacgttta cagaccaaag tgtigatttca caccgrtittic aaacccagca 1020 

ttattcattt cgctt:caatc aaaagtggtt ccaanacctt ttctagtcgg ttagaatact 1080 

ttcttcatag tcacattctc tcaacctaca atttggaata trgcrgcggt cccttgtctt 1140 

ttctctcagc atagcarrtt taaaaaaata caaaagcnac caacc::ttgt acaaczzgza 1200 

aatgctaaga attttcttta tatcrgttaa acaaaaat::a r.t-ccaacaa aaaaaaaaaa 1260 

aaaaaaaaaa aaaaaaaaaa aaanaa 1286 



<210> 122 
<211> 1380 
<212> DNA 

<213> Homo sapiens



<400> 122 

cagacccgcg gggcaaacgg accggggcca agagccggga gcgcgggcgc aaaggcacca 60 

gggcccgccc agggcgccgc gcacacggcc ttgggggccc tgcgggcctt cgggcgcgcg 120 

tctcgcctcc agccacgggg tccgcagcgc cggagaccct gggcccggtg ccgcgcccgg 180 
tgggctgggg gggtctgatc ccggcgcgcg ggccgcccat gtggcaggtg accgccttcc 240

tggaccacaa catcgcgacg gcgcagacca cctggaaggg gccgtggatg ticgcgcgtgg 300 

tgcagagcac gggcacatgc agcgcaaagt gcacgactcg gtgccggctc tgagcaccga 360 

ggtgcaggcg gcgcgggcgc ccaccgtgag cgccgtgctg ccggcgttcg ttgcgctctt 420 

cgcgaccctg gcgggcgcgc agtgcaccac ccgcgtggcc ccgggcccgg ccaaggcgcg 480 

tgtggccctc acgggaggcg tgccccaccc gttttgcggg ccgctggcgc tcgtgccact 540 

ctgctggctc gccaacattg tcgrccgcga gtrttacgac ccgcctgcgc ccgcgtcgca 600 

gaagtacgag ctgggcgcac gctgcacatc ggccgggcgg ccaccgcgcc gctcatggta 660 

ggcggctgcc tcctgtgctg cggcgcccgg gtccgcaccg gccgtcccga cctcagcttc 720 

cccgtgaagc actcagcgcc gcggcggccc acggccaccg gcgactacga caagaagaac 780 

tacgtctgag ggcgctgggc acggccgggc cccccctgcc agccacgcct gcgaggcgtt 840 

ggataagccc ggggagcccc gcatggaccg cggcttccgc cgggtagcgc ggcgcgcagg 900 

ctcctcggaa cgtccggctc tgcgccccga cgcggctcct ggatccgctc ctgcctgcgc 960 

ccgcagctga ccttctcctg ccaccagccc ggccccgccc ttaacagacg gaatgaagtt 1020 

tccttttctg tgcgcggcgc tgttcccaca ggcagagcgg gcgtcagact gaggatttcg 1080 

cttcccctcc aagacgctgg gggtcttggc tgctgcctca ccccccagag gcccccgctg 1140 

actccggagg ggcggatgca gagcccaggg cccccaccgg aagatgtgca cacctggtct 1200 

ttactccatc ggcagggccc gagcccaggg accagtgact cggcccggac ctcccggtct 1260 

caccccagca tccccccagg caaggcttgt gggcaccgga gcttgagaga gggcgggagt 1320 

gggaaggcta agaacctgct tagcaaacgc tttgaaccct caaaaaaaaa aaaaaaaaaa 1380 



<210> 123 

<211> 3793 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> SITE 

<222> (1102) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 

<222> (1132) 

<223> n equals a,t,g, or c 

<220>
<221> SITE 
<222> (1199) 

<223> n equals a,t,g, or c 

<220> 

<221> SITE 
<222> (1228) 

<223> n equals a,t,g, or c 

<220> 

<221> SITE 
<222> (1229) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 
<222> (1231) 

<223> n equals a,t,g, or c 
<220> 

<221> SITE 



